and bias in test use. The author states that conclusions re
the validity and bias of the WISC-R with minorities vary de
the definition of bias. A definition is proposed based on t
assessment are identified, including good fundamentals and
practices, clarification of purpose, and multifactored asse
reviewed, followed by a discussion of assessment of adaptiv
with mentally retarded persons. Requirements of P.L. 94-142
Education for All Handicapped Children Act) that the child
are considered. The continuing problem over defining mild m
retardation is examined. The author concludes by emphasizin
NONBIASED ASSESSMENT

Daniel J. Reschly
Iowa State University 

THE PROBLEM 

When in danger, when in doubt, 
Run in circles
Yell and shout.

The issues of bias in tests and in assessment have provoked high-frequency behaviors
of the type suggested in the anonymous saying quoted here. Much heat has been
generated through the yelling and shouting, but relatively little light. Illumination
of improved practices in psychology and education, especially procedures that would
expand opportunities and improve competencies for children, has been conspicuously
absent in most of the discussions.

Perhaps the main difficulty stems from a focus on the wrong problems and the
wrong questions in the discussion of nonbiased assessment. The major concern has
been with the assessment of minorities, particularly questions related to whether
specific tests are biased or unfair when used with black, Latino, or Native American
children. The issues related to the use of tests with children from minority
backgrounds are legitimate and important to raise. However, a more significant issue to
address is whether we can ensure educational experiences that maximize competencies 
and opportunities for minority students. 

Several of the wide assortment of definitions and criteria for determining bias 
in tests or assessment are discussed and evaluated in this paper. Although each of
these conceptions has merit, a more comprehensive view of bias in assessment is
proposed. Bias in tests, or bias in assessment generally, should be evaluated according
to the criterion of outcomes for individuals. The concern for outcomes for individuals
directs our efforts toward ensuring that assessment activities yield information
useful for educational and psychological interventions, and toward the effectiveness
of these interventions.

Effective solutions to the challenges posed by nonbiased assessment will not be
found simply in new tests or revisions in present tests. There are no culture free
or fair tests. Better assessment will be part of an effective response, but this
alone is not the answer. Further, other solutions such as scrupulous avoidance of
overrepresentation of minorities in special education may satisfy certain external
agencies, but this too is an ineffective solution.

Effective solutions are possible only through recognition of the larger problem.
The critical issue is the quality and effectiveness of the educational services
provided to economically disadvantaged students. Our part of the problem as special
education and related services personnel is the quality and usefulness of special
education services provided to economically disadvantaged persons referred to special
services.

It is this group, i.e., economically disadvantaged students referred for special
services, that has received an enormous amount of attention in recent years. The
discussions have been heated and controversial. Economically disadvantaged students,
often with minority status, have been and are placed in special education programs at
a rate that is disproportionate to their numbers in the total population. This
overrepresentation has been the subject of extensive litigation, legislation, and Federal
Office for Civil Rights activities.

Understanding the litigation, legislation, and federal compliance activities is
important in developing effective responses to the challenge of nonbiased assessment.
The implicit assumptions in the litigation must be understood in order to establish
open communication, and to identify practices in need of reform. Finally, clarification
of the implicit assumptions leads to the view of nonbiased assessment as a
process rather than a magic test or simple avoidance of overrepresentation.

LEGAL REQUIREMENTS 

Bersoff (1979) has provided a comprehensive review of the evolution of judicial
examination of psychological assessment. The courts have had an enormous influence
on psychological assessment and special education services. Nearly all of the major
principles codified in legislation of the mid and late 1970s appeared earlier in
judicial opinions or consent decrees (Turnbull, 1978). Appropriate assessment and
appropriate educational services for economically disadvantaged minority students were
among the most important issues in litigation in the early 1970s.

Litigation

Diana and Guadalupe Cases. Two cases in the early 1970s involved nearly identical
issues concerning psychological assessment and special education services for the
mildly retarded. The Diana (Diana v State of California, 1970) and Guadalupe (Guadalupe v
Tempe Elementary District, 1972) cases were filed as class action suits on behalf of
minority/bilingual students placed in programs for the Educable (mild) Mentally
Retarded (EMR). In both cases plaintiffs presented evidence indicating overrepresentation
of minority/bilingual students in EMR programs. For example, in the Diana case
the enrollment of Hispanic students in Monterey County, California was 18.5% of the
total enrollment, but one-third of the students in EMR special classes were Hispanic.
This overrepresentation was viewed as promoting segregation and in violation of
Fourteenth Amendment rights to equal protection of the laws. Conventional psychological
assessment practices, particularly intelligence tests, were regarded by the plaintiffs
(and apparently the courts) as the major cause of the overrepresentation.

Both cases were resolved through consent decrees negotiated between plaintiffs
and defendants and then approved by the courts. The consent decrees specified a
number of reforms in psychological assessment practices including the following:
assessment of primary language competence, and administration and interpretation of tests
in a manner consistent with the child's primary language; emphasis on nonverbal or
performance tests in classification decisions with bilingual students; and immediate
reevaluation of students who may have been misplaced. In addition the Guadalupe
consent decree lowered the IQ cut off for classification/placement decisions; required
assessment of adaptive behavior outside of school; and required that intelligence test
results not be the exclusive or primary basis for classifying children as mildly
retarded in the public schools. Implicit in both cases were the assumptions that
intelligence tests, especially verbal tests, were biased against bilingual students and
that special class programs for the mildly retarded were ineffective and stigmatizing.

Larry P. v Riles (1972, 1974, & 1979). The Larry P. case was a class action suit
related to the basic issue of overrepresentation of minority students in programs for
the mildly retarded. Larry P. was filed on behalf of Black children placed in programs
for the mildly retarded. The case was filed originally in November, 1971; an injunction
was issued by the Federal District Court for Northern California in June, 1972;
an expanded injunction was issued in 1974; the case was in trial from October, 1977 to
May, 1978; and an opinion was issued by Judge Peckham in October, 1979. The Larry P.
case has already been before the courts for nearly a decade. Appeal of the decision,
perhaps to the U.S. Supreme Court, is considered likely. The Larry P. trial generated
a 10,000 page transcript, much of which came from expert witnesses for the plaintiffs
and defense.

The preliminary injunction in Larry P. restrained the defendants (officials of
the San Francisco Public Schools and the California State Department of Education)
from

"placing Black students in classes for the educable
mentally retarded on the basis of criteria which
place primary reliance on the results of IQ tests
as they are currently administered, if (emphasis
added) the consequence of use of such criteria is
racial imbalance in the composition of such classes"
(Larry P. v Riles Court Injunction, 1972).

In 1974 the plaintiffs obtained an expansion of the injunction to all school districts 
in California. The 1979 court opinion also placed a ban on the use of intelligence
tests with black students. The key statement in the decision, which was 131 pages in
length, was

"Defendants are enjoined from utilizing, permitting the
use of, or approving the use of any standardized
intelligence tests, for the identification of black
E.M.R. children or their placement into E.M.R. classes,
without securing prior approval by this court" (Larry P.
v Riles, 1979, p. 104).


The implications of the Larry P. opinion for school psychology and special education
are unclear, but potentially enormous (See School Psychology Review, Vol. 9, No. 2,
1980). In my view, the court, through the plaintiffs' actions, identified a significant
problem; namely, the appropriateness of segregated special classes for "six hour"
retarded children. The opinion, however, is an instance of Right Problem-Wrong
Solution. A number of underlying assumptions are apparent in Judge Peckham's opinion.
These assumptions and the issues they represent are probably more important in
developing solutions to the problem of appropriate education for all children than the narrow
issue of potential bias in IQ tests (see later sections).

PASE vs Hannon (1980). A recent decision from a Federal District Court in Illinois
addressed the same issues as previous placement bias cases, but reached a markedly
different decision. Again, the primary issue was alleged bias in intelligence
tests. In contrast to previous decisions such as Larry P., the judge concluded that
very few items on conventional tests were biased and that other sources of information
were just as important as test scores in classification/placement decisions.


In view of the recent PASE opinion, and the expected appeals in both PASE and
Larry P., the present legal situation is highly ambiguous. Appeals typically are very
time consuming. Both cases may reach the U.S. Supreme Court in the mid to late 1980s.
Resolution of the question of bias through the courts has not been possible to date for
many reasons. Perhaps the most important reason is the inconsistent conceptions and
evidence on bias as well as the inherent nature of the judicial process (See later
sections).

Implicit Issues in Litigation 

Although the litigation concerning overrepresentation of minorities in special 
class programs for the mildly retarded focused on alleged bias in intelligence tests, 
a number of implicit assumptions were made by the plaintiffs and accepted by the courts. 
These assumptions represent unresolved issues in the professional literature, and are 
more important to the provision of fair and effective services to children than the
narrow (and perhaps unresolvable) issue of bias in intelligence tests. Examination
of these assumptions provides a better perspective on recent legislation as well as
suggestions for different approaches to the problems of bias in assessment and
appropriate classification/placement decisions with minority students.

Nature-Nurture. The debate over the relative effects of heredity and environment
in determining intelligence predates the development of measures of intelligence.
This very old debate has not been resolved and is not likely to be resolved in the
foreseeable future. The controversy was increased dramatically in the 1970s with the
extension of the hereditarian view to explain differences between racial groups
(Jensen, 1969). The views of other participants in the debate were sometimes
inflammatory (Shockley, 1971), and interpreted as stemming from frankly racist motives.
The reaction of many, including psychologists and the lay public, black as well as
white, was to denounce these widely publicized theories and interpretations of
equivocal evidence. Of particular importance to school psychologists were the vehement
attacks on intelligence tests that were prompted by the suggestions that racial
differences were due to hereditary factors. These reactions appeared in the literature,
and were undoubtedly a principal factor leading to the Larry P. and other court
actions. The available evidence on differences among races is equivocal. Strong
hereditarian, strong environmental, or interactionist positions have been supported
by citing evidence (Loehlin, Lindzey, & Spuhler, 1975). The debate is therefore
unlikely to be resolved through conventional empirical methods. The alternative
apparently chosen by the critics was to force a kind of resolution through legal
procedures. The ban on the use of ability tests with black students in California from
the Larry P. decision might be extended to other locations and to other minority
groups. Use of intelligence tests with minorities might be severely restricted in
the future, but it is unlikely that even this radical step would end or lead to
resolution of the nature-nurture debate. Moreover, eliminating the use of ability tests
with minorities would accomplish little if anything toward elimination of existing
barriers to the full participation of all persons in the economic and social order, and
would likely be counterproductive in that effort. Although not mentioned explicitly,
the nature-nurture issue was a crucial factor in the Larry P. litigation.

Meaning of IQ Test Results. A number of myths regarding the meaning of
intelligence test results have been around for several decades. Of particular concern are
the beliefs that IQ test results are predetermined by genetic factors, that
intelligence is unitary and is measured directly by IQ tests, and that IQ test results are
fixed. The available evidence clearly refutes these myths (Hunt, 1961; Reschly,
1980), and the vast majority of professional psychologists do not harbor these
misconceptions. Kaufman (1979a) provided an excellent discussion of the assumptions underlying
and the meaning of intellectual assessment. His views are probably typical of most
professional psychologists. However, many consumers of IQ test results such as teachers,
parents, and the lay public generally hold these misconceptions. Recent suggestions
to change the term IQ to School Functioning Level (Mercer, 1979) or Academic Aptitude
(Reschly, 1979) are designed to reduce these misconceptions.



A significant portion of the testimony in Larry P. was devoted to disproving these
myths. This testimony has a "straw man" quality. The fact that these myths were an
implicit issue in the litigation provides further evidence for the need to clarify the
meaning of IQ test results, and perhaps to rename the construct.


Labeling Effects. Implicit in all of the litigation was the assumption that
classification as Educable Mentally Retarded was stigmatizing and humiliating with
probable permanent effects. The controversy over labeling is far from resolved. The
available empirical evidence does not support the self-fulfilling prophecy notion and
direct effects of labels on the behavior of children or adults have been difficult to
document (MacMillan, Jones, & Aloia, 1974). The dilemmas associated with classification
have been prominent in the special education and school psychology literature
for the past decade. Much of the discussion has not been guided by empirical data.
The dilemma was described well by Gallagher (1972) who acknowledged the inevitability
of classification, but suggested that the crucial factor was whether the benefits of
services provided as a result of the label were sufficient to justify the possible
risks of the label. This risks/benefits criterion should guide our efforts in the
future to deal with this issue.

Meaning of Mild Mental Retardation. The reasoning of the Larry P. decision was
that the plaintiffs were not "truly retarded" despite low IQs, low academic
achievement, and teacher referral. The effort to identify "true" mental retardation appears
to be related to confusion of mild with the more severe levels of mental retardation.
The criteria for "true" mental retardation are apparently believed to require
comprehensive incompetence, permanence, and evidence of biological anomaly (Mercer, 1973;
1979). In contrast, the AAMD classification system does not specify etiology or
prognosis. In addition, different domains of adaptive behavior are emphasized
depending on the age of the individual. There was little doubt that the plaintiffs in
the placement litigation had serious academic problems. The question was whether
they were "truly" retarded, or whether they merely performed within the retarded range
due to biases in the IQ tests. Confusion over the meaning of mild mental retardation
and questions concerning the criteria for adaptive behavior were key issues in the
cases (See later section).

Efficacy of Special Classes. The efficacy of special classes for the mildly
retarded was challenged forcefully in the 1960s (e.g., Dunn, 1968). The lack of clear
evidence to support the effectiveness of special classes along with the allegations
concerning the negative effects of labels created a difficult situation for the
defendants (school districts and state departments of education) in the placement
litigation. Further, the overrepresentation of minorities in segregated special classes
raised questions about segregation of student groups by race. In several instances
the school districts and state departments of education did not defend their programs
in court. Consent agreements were negotiated out of court. In the Larry P. case a
defense of the programs was attempted, but unsuccessfully. It should be noted that if
the special class educational programs were as poor as alleged, then no child
regardless of race or social class should be placed in such programs. The crucial issue,
but implicit in the litigation, was effectiveness of special class programs.
Unfortunately, the plaintiffs and courts seemed to focus on the criteria for placement of
students rather than the effectiveness of the programs as such. Additional research
on the effectiveness of special education programs using longitudinal designs is
clearly needed.

Meaning of Bias. Many definitions of bias in tests have been proposed in the
psychological and educational measurement literature (see later section, this paper).
Two criteria for bias have been implicitly accepted by the courts.

In all of the placement litigation the plaintiffs presented evidence on
overrepresentation of minorities in special education programs. The overrepresentation
data bear closer analysis. In many cases these data may have been misunderstood.
For example, when the original Larry P. case court injunction was expanded in 1974,
the percentage of black students in the San Francisco schools was 30%, but the
enrollment in programs for the Educable Mentally Retarded (EMR) was 60% black. The
comparable statewide figures in California for the past ten years have been
approximately 10 and 23 percent respectively for total black student enrollment and black
student enrollment in EMR special classes. Quite clearly, black students have been
overrepresented. However, these figures have sometimes been understood to indicate
that many if not a majority of blacks were diagnosed as mentally retarded, allegedly
because of biased IQ tests. For example, in the Larry P. opinion these percentages
were characterized as being "grossly disproportionate" (p. 94) and as indicating
"overwhelming disproportions" (p. 101). However, over the past decade the percentage
of the total black student population in California placed in EMR special classes has
varied from about 3.2% in 1968-69 to about 1.1% in 1976-77 (Reschly, 1980). Only a
very small percentage of minorities have been placed in special class programs even
in cases that appear to reveal very high overrepresentation. These data certainly do
not support the notion that IQ tests have a pervasive deleterious effect on black
children.
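
The two kinds of percentages contrasted above answer different questions, and a short calculation makes the difference concrete. The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are simply the figures quoted in the text (the Monterey County, San Francisco, and California statewide shares, and the placement rates attributed to Reschly, 1980). It is not a reconstruction of any analysis presented to the courts.

```python
def representation_ratio(share_of_program: float, share_of_enrollment: float) -> float:
    """How many times larger a group's share of a program is than its share of enrollment."""
    return share_of_program / share_of_enrollment


def percent_not_placed(placement_rate: float) -> float:
    """Percentage of a group that is NOT in the program, given that group's placement rate."""
    return 100.0 * (1.0 - placement_rate)


# (share of EMR enrollment, share of total enrollment), as quoted in the text.
cases = {
    "Diana, Monterey County (Hispanic)": (1 / 3, 0.185),
    "Larry P., San Francisco, 1974 (black)": (0.60, 0.30),
    "California statewide (black)": (0.23, 0.10),
}
ratios = {name: representation_ratio(p, e) for name, (p, e) in cases.items()}

# Proportion of all black students in California placed in EMR classes, as quoted.
placement_rates = {"1968-69": 0.032, "1976-77": 0.011}
not_placed = {year: percent_not_placed(rate) for year, rate in placement_rates.items()}

for name, ratio in ratios.items():
    print(f"{name}: EMR share is {ratio:.1f} times the enrollment share")
for year, pct in not_placed.items():
    print(f"{year}: {pct:.1f}% of black students were not in EMR classes")
```

A representation ratio near 2 and a placement rate of a few percent are both true of the same data. The first describes the composition of the special classes; the second describes what happens to members of the group, which is the quantity the argument in the text turns on.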

The possible causes of the overrepresentation and other factors associated with
overrepresentation should also be considered. The overrepresentation of males in
programs for the mildly handicapped (i.e., learning disability, behavior disorders, and
mild mental retardation) is greater than the overrepresentation of minorities. The
overrepresentation of students from economically disadvantaged homes is even more
pronounced for the category of mild mental retardation regardless of racial/ethnic
status. Minority status and socioeconomic status are (unfortunately) not independent.
The intriguing question is whether minorities are overrepresented beyond the level
that might be predicted from socioeconomic status data.

Overrepresentation is a simplistic and often misunderstood notion of test bias.
Nevertheless this definition continues to be used by the courts (e.g., Mattie T. vs
Holladay, 1979). If carried to its logical conclusion this definition could result
in elimination of virtually all special services programs due to alleged race/ethnic,
sex, or socioeconomic bias. This illogical outcome is not in the best interests of
children.

The other definition of bias used by the courts is mean differences in scores
among groups. This definition is discussed in a later section of this paper. 

Special education placement litigation has been a significant influence in recent
years. Unfortunately, the courts by their nature are not a desirable mechanism for
resolving disputes in the behavioral sciences. In contrast to the behavioral sciences
and professions, the fundamental purposes and methods of resolving issues are quite
different in the courts. The legal system in the placement litigation is concerned
with abstract principles of justice, particularly as they apply to groups of persons.
The sciences are devoted to "truth" which is recognized as being tentative and
approximate. The perspectives of professional personnel such as special educators and school
psychologists are typically focused on the individual who is having significant
learning or behavioral problems in the classroom. The explicit and implicit issues in the
litigation are at best ambiguous. None of the issues can be resolved unequivocally
through the scientific method of theory, research, and analysis of data. The available
evidence is at the level of probability statements which would justify decisions using
language such as "might" or "should." The professionals involved appear to operate in
a manner consistent with this evidence. For example, not all children with low IQ scores
are placed in special programs, and a few with IQ scores above cut off criteria are
placed. These decisions are based on a comprehensive view of the individual and the
best estimates of what is best for that individual. The overrepresentation that has
resulted has been the culmination of decisions about individuals, not decisions about
groups. The status of groups of persons has been an important area of judicial inquiry
which has expanded since the Brown decision in 1954. However, the courts by their
nature reach decisions which pertain to groups and are stated in decisive, unequivocal
language such as "shall" or "must." The court remedies are therefore rarely consistent
with the scientific evidence, or the approach of professionals.

Legislation

The litigation of the late 1960s and early 1970s was an important source of
influence on State and Federal legislation in the mid 1970s. The appendices to this
paper include the Protection in Evaluation Procedures Provisions (Sections 121a.530 to
121a.534, Federal Register, 1977) of PL 94-142. These requirements are particularly
relevant to the challenge of providing appropriate assessment services for all
students, and should be read carefully by all special education personnel. Two
provisions are particularly important.

"Testing and evaluation materials and procedures used for
the purposes of evaluation and placement of handicapped
children must be selected and administered so as not to
be racially or culturally discriminatory." Section
121a.530(b), Federal Register, 1977.

"In interpreting evaluation data and in making placement
decisions, each public agency shall: Draw upon information
from a variety of sources, including aptitude and
achievement tests, teacher recommendations, physical
condition, social or cultural background, and adaptive
behavior; Insure that information obtained from all of these
sources is documented and carefully considered." Section
121a.533(a, 1 & 2), Federal Register, 1977.

The requirement that assessment be nondiscriminatory is deceptively simple. The
language is unequivocal, but no definition is provided and no criteria are available
in the legislation or rules and regulations concerning implementation of the regulation.
The apparent solution was to require that a broad variety of information be considered
including social or cultural background and adaptive behavior. The meaning,
measurement, and use of these concepts are also far from clear.

CONCEPTS OF BIAS IN TESTS AND RESEARCH WITH THE WISC-R 

Much of the special education placement litigation, as well as other discussions
of overrepresentation, has assumed that conventional tests are biased against minority
students. Careful examination of the educational and psychological literature reveals
a different picture. There are many definitions of bias, a variety of ways to analyze
the data, and widely varying conclusions reflected in this literature. Surprisingly,
some of the widely held assumptions about common tests simply are not supported by
empirical evidence.

The concept of test bias has been defined in many different ways in the recent
literature. In what is perhaps the most comprehensive discussion of different
definitions, Flaugher (1978) identified eight separate concepts of bias in testing. Other
recent examinations of test bias have analyzed the different values which underlie
varying positions (Hunter and Schmidt, 1976); the different procedures for enhancing
fairness vs. social equity in selection (Petersen and Novick, 1976); and the different
outcomes of empirical examination of test bias depending on the definition and
criteria used (Reschly, 1981). Others, e.g., Ysseldyke (1979), have stressed factors
such as naturally occurring pupil characteristics (e.g., physical attractiveness) which
bias decisions before and after formal assessment activities are conducted.

Close examination of some of the recent definitions of bias will reveal both the
varying conceptions and criteria as well as some common features. Flaugher (1978)
identified the following definitions of test bias: mean differences,
overinterpretation, sexism, differential validity, content, selection model, wrong
criterion, and atmosphere (referring to situational factors in examiner characteristics,
examiner-examinee interaction, etc.). Jones (1978) suggested that test bias might
exist at the content level in the selection of items; in the standardization where
decisions are made concerning the population for whom the test is appropriate; in
the administration of the test where the examiner may be unfamiliar with the culture
of the child; and in the validation where research may not be conducted concerning
test validity for culturally different persons. Mercer (1978) suggested that the
following five lines of evidence establish the existence of bias in tests: test items
from a single cultural heritage; differences in average scores among different racial
and cultural groups; sociocultural differences within and between cultural and racial
groups with these differences accounting for a significant proportion of the variance
in test performance; experimental studies demonstrating the effects on test performance
of early interventions with culturally different children; and the effects of adoption
of minority children into core culture homes.


It is interesting to note the commonalities in the definitions proposed by these
diverse authors who represent different disciplines, cultural groups, and perspectives.
The criterion of item or content bias is mentioned by all, with at least two of the
authors agreeing on the criteria of average score differences (Flaugher and Mercer),
administration or atmosphere effects (Flaugher and Jones), and differential validity
(Flaugher and Jones). Similar concerns appear to be expressed, although in different
ways, by Flaugher and Jones regarding misinterpretation and the appropriateness of
the criterion in validity studies.

An obvious question is which definition of test bias is correct. What criterion
should we use in the examination of commonly used tests? The answer, perhaps
unfortunately, is far from simple. Flaugher (1978) argues that all of the definitions are
"right" in the sense that test bias is a public concern, i.e., not restricted to an
academic discipline, and significant numbers of citizens have legitimate interests
and concerns in the definition used.


Table 1 provides a list of the common definitions of test bias, and a summary of
results from many studies. On most criteria, conventional intelligence tests are not
found to be biased. However, the social consequences of test use have often been
negative, an issue to which we shall return in this paper.



Table 1

Summary of Concepts and Empirical Studies of Bias in Tests

Definition of Bias        Empirical Studies       Results
                                                  (Confirmed/Equivocal/Not Supported)

1. MEAN DIFFERENCES       Large number of         Economically disadvantaged minority
                          studies.                students obtain lower average scores.
                                                  The size of the differences varies by
                                                  group and/or, for some groups, by type
                                                  of measure.

2. ITEM BIAS              Several recent          Subjective judgments usually identify
                          studies using the       many items as biased. However,
                          WISC-R. Many studies    subjective judgments are unreliable.
                          with group tests.       Empirical studies generally do not
                                                  support the existence of item bias on
                                                  conventional tests.

3. PSYCHOMETRIC           Several recent          Psychometric characteristics such as
                          studies.                reliability, item x total, subtest x
                                                  scale, etc. are the same regardless of
                                                  group.

4. FACTOR ANALYSIS        Several recent          The factor structure on tests such as
                          studies.                the WISC-R is largely the same
                                                  regardless of group.

5. ATMOSPHERE BIAS        Many studies.           Inconsistent results, often
                                                  contradictory. The size of the effects,
                                                  if real, is small.

6. PREDICTIVE VALIDITY -  Many studies.           The relationship between ability and
   TESTS OF ACHIEVEMENT                           achievement tests is virtually the same
                                                  regardless of group. Issue of
                                                  "autocorrelation" is unresolved.

7. PREDICTIVE VALIDITY -  Few studies.            Inconsistent results, apparently due to
   TEACHER RATINGS/                               type of criterion measure.
   GRADES

8. SOCIAL CONSEQUENCES    Few published           Conventional tests are frequently
   Misuse,                studies, considerable   overinterpreted and/or misinterpreted.
   misinterpretation,     anecdotal and           Test results have been used to justify
   overinterpretation     historical evidence.    restrictive and sometimes racist social
                                                  policies.

9. SELECTION RATIOS       Many "indirect"         Economically disadvantaged minority
                          studies.                students are overrepresented in special
                                                  education programs for the mildly
                                                  retarded. Tests are used as part of
                                                  that process. Whether test use
                                                  increases OR decreases the
                                                  overrepresentation is unclear.



CONSTRUCT VALIDITY/CONTENT BIAS

Perhaps the most commonly used definition of test bias is the assertion that the
WISC-R and other conventional standardized tests measure a different attribute when
used with non-Anglo persons. This assertion amounts to a criticism that the construct
validity of the test is not the same for all groups. If the test measures different
attributes and the items function differently depending on group membership, then the
meaning and usefulness of the test results probably are diminished. Moreover, the
mean differences between groups are then attributed to inappropriateness of the test
items, and other explanations such as economic disadvantage are rejected. Thus, the
construct validity/content bias conception has broad implications for examination of
bias in specific instruments such as the WISC-R. A number of different criteria have
been suggested for examination of construct validity/content bias. Data from studies
using the WISC-R will be discussed in relation to four criteria: mean differences,
item bias, psychometric characteristics, and factor analysis.

Mean Differences

That different sociocultural groups obtain higher or lower scores on the average
on standardized ability and achievement tests is one of the most [...] and
oldest observations in the history of psychological measurement. A variety of attempts
have been made to eliminate these differences through changing the content of the tests.
All such efforts to produce culture-free or culture fair tests have failed, if for no
other reason than the fact that the concept of culture free or fair tests [...].
Anastasi (1976) points out that the entire notion of assessment [...] from
the beginning steps of specifying what to measure to the [...] validity data on
the relationship of test results to some criterion behavior. The very practical
question is, how would we use a culture free test even if one could be developed?



Although eliminating mean differences through revision of the tests seems virtually
impossible, knowledge of the nature of the mean differences is important information
for test use and interpretation of test results. Mean differences and variations in
patterns of performance among groups are not simply attributable to socioeconomic
status (SES). For example, Lesser, Fifer, and Clark (1965) reported that most of the
differences in level of performance among groups were accounted for by SES. However,
sociocultural group had a significant influence on the pattern of performance
independent of SES. As we shall see, this finding appears to hold true for the
performance of different groups on the WISC-R.

In Table 2 the means for different groups on the WISC-R are reported from data
obtained from four studies of random samples.



From the somewhat limited and controversial perspective of mean differences among
groups, the WISC-R would be regarded as biased against Black and Hispanic groups. In
all studies which included Blacks and Hispanics, both groups obtained significantly
lower scores on the WISC-R. Some variation in pattern of performance is also apparent
in those data. Hispanic students obtained significantly higher scores on the
Performance Scale than on the Verbal Scale. It might be noted, however, that the
Performance Scale scores generally were still below the population average and that
not all non-Anglo groups score higher on the Performance Scale. Cautious use of these
results in generalizations to groups from other regions clearly is necessary.
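
The size of a mean difference is easier to compare across samples when it is expressed
in standard deviation units. The sketch below does that arithmetic for the Anglo and
Black Full Scale means as read from Table 2. It assumes the conventional IQ standard
deviation of 15; the function name and the use of the Anglo mean as the reference
point are illustrative choices, not part of the original analyses.

```python
# Express Full Scale IQ mean differences in standard deviation units.
# Assumption: the IQ scale has a standard deviation of 15.
IQ_SD = 15.0

# (Anglo mean, Black mean) Full Scale IQ values as read from Table 2.
full_scale_means = {
    "Kaufman & Doppelt (1976)": (102, 86),
    "Mercer (1979)": (103, 88),
    "Reschly (1978)": (101, 86),
    "Reschly & Ross-Reynolds (1980)": (110, 95),
}


def difference_in_sd_units(reference_mean, group_mean, sd=IQ_SD):
    """Return the gap between two means as a multiple of the scale SD."""
    return (reference_mean - group_mean) / sd


for sample, (anglo_mean, black_mean) in full_scale_means.items():
    gap = difference_in_sd_units(anglo_mean, black_mean)
    print(f"{sample}: {gap:.2f} SD")
```

A figure of this kind describes only the size of a difference; as the text goes on to
note, it carries no information about causation.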

Data on the existence of mean differences do not of course provide any information
on causation. However, average scores below the population mean are not unique to
non-Anglo groups. Identifiable groups of White Anglo Saxon Protestants also score
below population means, e.g., Appalachian Whites. This fact should convince
any remaining skeptics that it is not race or ethnicity per se that is the cause
of the differences among groups. In the recent literature, three explanations
have been proposed to account for the differences among groups. Trotman (1977)
provides a good overview of the explanations of cultural differences, cultural
disadvantage, and genetic inferiority. Analysis of the logic and data for each of these
explanations has consumed enormous amounts of time and space in the psychological
literature. Each of the explanations has obvious implications for interpretation of
the WISC-R and for social policy. For example, the cultural difference explanation of
the variations in group means would suggest that the content of the test does not
reflect what is regarded as intelligent behavior in non-Anglo (non-middle class)
cultures. The data which follow on item bias are at least partly relevant to this point
of view. The social policy implications of the cultural difference view might be to
eliminate conventional tests, to develop culturally specific tests (Williams, 1971),
or to correct the bias in current tests (Mercer, 1979). The cultural disadvantage
view would emphasize the inadequate stimulation for intellectual development in lower
social class homes. The solution to differences in group means from this perspective
would be to provide intensive early interventions (Garber, 1975) and compensatory
education programs. The current tests such as the WISC-R, however, are accepted as
valid indicators of intellectual ability (scholastic aptitude). In the past decade
the debate over the third explanation, genetic inferiority (Jensen, 1969), has
generated enormous controversy. The data available currently, and the kind of data that
can be generated, provide an inadequate basis for resolution of the question of
hereditary differences among groups. A complete review of the data on the nature-nurture
issue is far beyond the scope of this paper. The interested reader is encouraged to
examine Brody and Brody (1976), Jensen (1973), Loehlin, Lindzey, and Spuhler (1975),
and Samuda (1975). Perhaps the most objectionable feature of some versions of the
hereditarian position is the recommendation of changes in social policy as a result of
data which are at best tentative. The sense of outrage among minority psychologists
and the efforts to ban tests can perhaps be understood if we are aware of some of the
extreme hereditarian views, e.g., Shockley (1971). Other implications of the
hereditarian view are to place less emphasis on governmental support of early
intervention and compensatory education programs, and perhaps unintentionally, but
implicitly, more emphasis on interpretation of IQ scores as reflecting the genetic
endowment of the individual. The proper interpretation of IQ test results will be
discussed in a later section.

Table 2

Mean WISC-R Scores for Different Sociocultural Groups

Sample                              Anglo              Black             Hispanic
                                  V    P    FS       V    P    FS       V    P    FS

1. Kaufman & Doppelt (1976)      102  102  102      88   87   86      Not available
   Standardization Sample           (N=1870)           (N=305)

2. Mercer (1979)                 102  104  103      89   90   88      88   98   92
   SOMPA Standardization            (N=604)            (N=456)           (N=52?)
   Sample (CA)

3. Reschly (1978)                101  102  101      86   89   86      85   93   88
   Pima Co. (AZ)                    (N=252)            (N=235)           (N=223)

4. Reschly & Ross-Reynolds       108  110  110      96   96   95
   (1980) Iowa Assessment           (N=100)            (N=100)
   Project

V = WISC-R Verbal Scale IQ Score
P = WISC-R Performance Scale IQ Score
FS = WISC-R Full Scale IQ Score

Item Bias

Allegations of cultural bias in the items used on conventional tests have been
and continue to be the most popular of the criticisms of standardized tests. In fact,
examination of an item from a current standardized test to support the allegation of
bias in all of the items appears to be an increasingly popular indoor sport. Examples
of subjective judgments of item bias are numerous (e.g., APA Monitor, 1977; Dent, 1976;
Williams, 1971). The implicit assumption is that all items on the test are biased if
one or a few of the items are apparently biased. If the test is presumed to be biased
on the basis of inappropriate items, then the test results are presumed to be
"inaccurate" and unfair. If the items are biased, usually meaning that opportunity to
learn the content of the item is not common to all environments, then the test results
certainly do not reflect, and cannot be interpreted as evidence of, "innate"
intelligence. However, the IQ test results are not direct measures of innate ability
for any group.

The distinction between cultural bias and cultural loading is important to this
discussion. The degree of cultural loading of an item, that is, the likelihood of
success on the item for persons with different backgrounds and experiences, varies
on a continuum. At one end of the continuum are items that could only be answered
correctly by persons with highly specific backgrounds and experiences. An example
might be an item that asks "Name three presidents of Iowa State University over the
past century" (the present author can name only two). The item is similar to those
on many intelligence tests in terms of the type of thinking required. However, only
a very limited sample of persons would have an opportunity to be exposed to this
information and thereby answer the item correctly. The item reflects a very high degree
of cultural loading and would be regarded by most as culturally biased (as well as
trivial). Some items on current standardized tests require similar kinds of thought
patterns and also vary in degree of cultural loading. Another item parallel to the
above example would be "Name two presidents of the United States since 1960." This
item is certainly lower in cultural loading because the information required is more
general, and many more persons would have an opportunity to learn the correct
responses. However, some persons might still judge this item as culturally biased since
the opportunity to learn the information might vary among different groups. The
question of bias in an item should be determined by empirical analysis of responses
from persons representing different groups, not by judgment alone.

The degree of cultural loading of an item depends on the generality of the
information and the characteristics of the persons taking the test. These points are
illustrated well in the development of "counterbalanced" or culturally specific
intelligence tests (e.g., Dove, undated; Williams, 1975). These tests require highly
specific information that is usually possessed only by persons with particular
backgrounds or experiences. In Table 3 examples of culturally specific items are
provided.

The evidence on item bias has been produced through two markedly different methods
of examining test items. The most common method has been subjective judgments of item
content. A less common method has been empirical analysis of the item responses of
examinees from different racial or ethnic groups. The results of "tests" of item bias
vary dramatically depending on which method is used.

Subjective judgment as a method usually involves obtaining opinions from expert
representatives of the minority culture regarding whether or not the items in a test
are biased against examinees from that culture. Two Verbal Scale items on the WISC-R
have been cited frequently as biased. The Information subtest item, "Who discovered
America?" and the Comprehension subtest item "What is the thing to do if a boy (girl)
much smaller than yourself starts to fight with you?" are most often cited as biased
against Chicano or Native American and Black examinees, respectively. There are two
major problems with the subjective judgment method of determining item bias. First,
the inter-judge agreement among experts representing the minority cultures is usually
quite low (Sandoval & Millie, 1980). Second, and most important, the results of
empirical analysis do not confirm the subjective judgments.

Table 3

Sample Items From Culture-Specific Tests

Dove Counterbalanced Intelligence Test
(Urban Black Culture)

"T-Bone Walker" got famous for playing what? (a) Trombone (b) Piano
(c) "T-Flute" (d) Guitar (e) "Hambone".

A "Gas Head" is a person who has a ______. (a) Fast moving car (b) Stable of
"lace" (c) "Process" (d) Habit of stealing cars (e) Long jail record for arson.

If you throw the dice and "7" is showing on the top, what is facing down?
(a) "Seven" (b) "Snake eyes" (c) "Boxcars" (d) "Little Joes" (e) "Eleven".

Cheap "Chitlins" (not the kind you purchase at a frozen-food counter) will
taste rubbery unless they are cooked long enough. How soon can you quit cooking
them to eat and enjoy them? (a) 15 minutes (b) 12 hours (c) 24 hours
(d) 1 week (on a low flame) (e) 1 hour.

"Jet" is ______. (a) An "East Oakland" motorcycle club (b) One of the gangs
in West Side Story (c) A news and gossip magazine (d) A way of life for
the very rich.

Counterbalanced Intelligence Test (Source Unknown)
(Urban Hispanic, Southwest)

The name "Jesus" in particular seems to disturb teachers and [...]
changed to ______.

The Spanish language spoken in the southwestern states is known by Mexicans
as ______.

Who was considered the Mexican Robin Hood of California?

Complete the following rhyme:
    Pancho Villa
    mato su tia
    con una tortilla
    ______

The first Chicano to have a big hit record was the person who sang "Donna"; what
was his name?

Fry Bread IQ Test (Deer, 1980)
(American Indian Intelligence Test)

1. Social dancing and singing held after hours is called:
   (a) Indian two step (b) Pow-wow (c) Forty-nine (d) Indian Rock

2. The Annual American Indian Fair and Exposition is held at:
   (a) Crow Agency, Montana (b) Gallup, New Mexico (c) Anadarko, Oklahoma
   (d) Pine Ridge, South Dakota

3. The largest of the American Indian tribes is the:
   (a) Navajo (b) Sioux (c) Cherokee (d) Creek

4. A food staple traditional to many Indian Tribes is:
   (a) Buffalo (b) Fry Bread (c) Commodity Cheese (d) Indian Round Steak
   (Bologna)

5. Who said "The only good Indian is a dead Indian?"
   (a) Col. George Custer (b) Gen. Andrew Jackson (c) Gen. John Pershing
   (d) Gen. Phil Sheridan

Empirical analyses of item bias on a variety of tests have generally yielded
equivocal or negative results regarding hypotheses of item bias (Sandoval, 1979). As
Flaugher (1978) pointed out, if the phenomenon of item bias is real on conventional
tests, it certainly does not account for a very large portion of the group differences.
Elimination of biased items and rescoring the tests does not lead to significantly
different results in the research published thus far.
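
One simple way to look at items empirically, sketched here with invented responses,
is to compare the proportion of each group passing each item and to ask whether any
item departs noticeably from the average difference between the groups. This is only
an illustration of the general idea of an item-by-group comparison. The function
names, the threshold of .30, and the data are hypothetical, and the studies cited
above may well have used different procedures.

```python
def proportion_passing(responses):
    """Proportion passing each item; responses is a list of 0/1 rows per examinee."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]


def flag_items(group_a, group_b, threshold=0.30):
    """Return indexes of items whose group gap departs from the average gap.

    An item is flagged when its difference in proportion passing deviates from
    the mean difference across all items by more than `threshold` (hypothetical).
    """
    p_a = proportion_passing(group_a)
    p_b = proportion_passing(group_b)
    gaps = [a - b for a, b in zip(p_a, p_b)]
    mean_gap = sum(gaps) / len(gaps)
    return [i for i, gap in enumerate(gaps) if abs(gap - mean_gap) > threshold]


# Invented data: four examinees per group, four items scored pass (1) or fail (0).
group_a = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0]]
group_b = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1], [0, 1, 0, 0]]

print(flag_items(group_a, group_b))
```

The point of centering on the average gap is that an overall difference between
groups is not by itself treated as item bias; only items that behave differently from
the rest of the test are singled out.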


The evidence, though certainly not definitive at this point, fails to support
item bias as a significant explanation for the differences in mean scores among groups.
Test items do vary in amount of cultural loading. Items on current tests are culturally
loaded to varying degrees, as they must be if tests are to predict or evaluate
important behaviors that occur only within a cultural context. Subjective judgments of
item bias are not necessarily accurate, and revision of current tests either in the
direction of greater or lesser cultural loading might have the undesirable effects of
simultaneously increasing or maintaining group differences and reducing validity.

Psychometric Characteristics

A large number of possible studies could be conducted on the internal psychometric
characteristics of the WISC-R when used with different groups. Some of the possible
analyses of interest would be comparisons across groups of internal consistency
reliability, subtest intercorrelations, subtest correlations with Verbal, Performance,
and Full Scale IQ scores, test-retest reliability, and intercorrelations of the Verbal,
Performance, and Full Scale IQ scores. To date, very few such studies have been
reported.
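
Internal consistency reliability, the first analysis listed above, is commonly
estimated with Cronbach's alpha: the number of items k divided by k - 1, multiplied by
one minus the ratio of the summed item variances to the variance of the total score.
The sketch below computes alpha separately for two groups so that the coefficients can
be compared. The scores and group labels are invented for illustration and are not
data from any study discussed here.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of examinee rows, each a list of k item scores."""
    k = len(scores[0])
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented item scores for two hypothetical groups (five examinees, three items).
scores_by_group = {
    "group_1": [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5], [3, 3, 2]],
    "group_2": [[1, 2, 2], [3, 3, 4], [2, 1, 2], [4, 5, 4], [5, 4, 5]],
}

for group, scores in scores_by_group.items():
    print(f"{group}: alpha = {cronbach_alpha(scores):.2f}")
```

Comparable coefficients across groups would be consistent with the claim that the
test is equally reliable for each group; the comparison says nothing about validity.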

Sandoval (1979) examined the internal consistency reliability (Cronbach Alpha) of
the WISC-R subtests and IQ scales using the SOMPA standardization data. The
reliability of the subtests and IQ scales was high and clearly the same for Anglos,
Blacks, and Chicanos with the exception of Object Assembly, which was more reliable for
Blacks (.95) than for Anglos and Chicanos (.79 and .75, respectively). All other
differences in reliability on WISC-R subtests were negligible with no systematic
pattern of group differences. The reliabilities of the Verbal Scale and Performance
Scale IQs were virtually identical for the three groups (rounded to .97 and .94
respectively). In the only other study pertinent to the issue of WISC-R reliability
located by the author, Dean (1977) reported data on the reliability of the WISC-R from
a sample of Chicano students referred for psychological evaluations in the Phoenix (AZ)
area. The reliabilities of the subtests and IQ scales were comparable to the data
reported in the WISC-R manual for the age group included in the study (11.5 year olds).

Factor Analysis

Comparability of factor analytic results for different groups, and the degree to
which the results of the factor analysis are consistent with the major scores and
common interpretations of the test, are necessary conditions for fairness in use of the
test with culturally diverse persons. Indeed, if a test is not measuring the same
underlying abilities or if the commonly used scores from the test represent varying
abilities depending on group membership, then use of the test with culturally different
persons is probably inappropriate and unfair, and the predictive validity of the test
is likely to be lower for specific groups.

The appropriate number of factors that should be interpreted was a somewhat
controversial issue in research on the WISC. The careful analyses of the
standardization data conducted first by Kaufman (1975) and then by Silverstein (1977)
seem to have resolved this problem. Both authors reported three factors: Verbal
Comprehension (VC) formed by four Verbal Scale subtests, Perceptual Organization (PO)
formed by five Performance Scale subtests, and a third factor labeled tentatively as
Freedom from Distractibility (FD) formed by a combination of three subtests from the
two scales. The reader is referred to several sources for a more thorough discussion of
the use of the factor analysis results (Kaufman, 1975; 1979a; 1979b; and Reschly &
Reschly, 1979).

Reschly (1978) investigated the WISC-R factor structure using data from four
sociocultural groups in Pima Co., Arizona. The methodology used was a replication of
Kaufman's 1975 analysis of the standardization data. The major questions addressed in
this study were: the appropriate number of factors for the four groups, the
comparability of the factors, the relationship of the factors to the IQ scales, and the
evidence for a similar general factor among the groups.

The objective guides to the appropriate number of factors to interpret yielded
inconsistent results. Three factors were indicated for Anglos, two or three factors for
Chicanos depending on the criterion used, and only two factors for Blacks and Native
American Papagos. In view of the contradictory evidence, both two and three factor
solutions were analyzed for all four groups.

The two factor solutions were highly similar for all groups. The first and second
factors for all groups conformed almost perfectly to the organization of the WISC-R
into Verbal and Performance Scales. For all groups the Vocabulary (V), Information (I),
Similarities (S), and Comprehension (C) subtests were the best measures of the first
factor, as were Object Assembly (OA) and Block Design (BD) for the second factor.
Coefficients of congruence reflecting the similarity of the two factor solutions across
the four groups were very high (.97 to .99).
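
A coefficient of congruence compares the loadings of the same subtests on a factor
obtained in two different groups: the sum of the products of paired loadings is
divided by the square root of the product of the two sums of squared loadings. Values
near 1.0 indicate nearly proportional loading patterns. The sketch below applies that
formula to invented loadings; the numbers are hypothetical and are not the values
reported for the Pima County samples.

```python
from math import sqrt


def congruence(loadings_x, loadings_y):
    """Coefficient of congruence between two sets of factor loadings."""
    cross_products = sum(x * y for x, y in zip(loadings_x, loadings_y))
    sum_sq_x = sum(x * x for x in loadings_x)
    sum_sq_y = sum(y * y for y in loadings_y)
    return cross_products / sqrt(sum_sq_x * sum_sq_y)


# Invented loadings of six subtests on a verbal factor in two hypothetical groups.
verbal_group_1 = [0.78, 0.74, 0.70, 0.66, 0.25, 0.20]
verbal_group_2 = [0.75, 0.70, 0.72, 0.60, 0.30, 0.18]

print(f"congruence = {congruence(verbal_group_1, verbal_group_2):.3f}")
```

Because the coefficient reflects the pattern rather than the absolute size of the
loadings, two groups can show high congruence even when one group's loadings are
uniformly somewhat larger.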

As might have been anticipated from the preliminary data on appropriate number of
factors, the three factor solutions varied significantly for the four groups. The
patterns for Anglo and Chicano groups were nearly identical to the data reported for
the standardization sample. The three factor solutions yielded an uninterpretable third
factor for Native American Papagos, and a splitting of the major Performance Scale
subtests into two factors for Blacks. In the three factor solutions the coefficients of
congruence were very high across all groups for the first factor, high for the second
factor, and high and comparable only for Anglos and Chicanos on the third factor.

A final series of analyses with the Pima County data were conducted around the
question of evidence for a general factor on the WISC-R for the diverse groups. Three
methods of estimating the amount of variance attributable to a general factor yielded
results pointing to the conclusion that the WISC-R Full Scale IQ score reflects the
same attribute regardless of group membership. The fluctuations between groups in
amount of variance attributable to a general factor were minor.

Differences between groups in the factor analysis of the Pima County data were
found only in relation to the nature and composition of the third factor. The meaning
of this factor, which accounts for a relatively small proportion of the WISC-R
variance, has never been entirely clear. The other evidence from this study clearly
supports the construct validity of the WISC-R with non-Anglo groups. Nearly identical
two factor solutions which conform closely to the organization of the scales were
found. A large general factor was clearly apparent in about the same form and
magnitude for all groups. Thus, the usual interpretation of the Full Scale IQ as an
index of general intelligence (scholastic aptitude) and the Verbal-Performance scale
distinction appear to be equally appropriate for Anglo and non-Anglo groups.

Summary: Construct Validity/Content Bias. In this section, data on four different
methods for determining construct validity/content bias in the WISC-R were analyzed.
Although the mean differences criterion is somewhat inconsistent with the other
criteria discussed in this section, it was included here because mean differences are
frequently explained by allegations of content bias or assertions that the WISC-R
measures a different attribute in Anglo and non-Anglo groups. The mean differences
criterion raises troublesome questions because it seems to lead directly to prejudging
and ruling out the reality of differences among groups (Thorndike, 1971). Mean
differences as such provide only weak evidence on test bias, and have no bearing on the
question of bias in test use. Nevertheless, this criterion was discussed here since it
is even less consistent with the other general conceptions of test bias around which
much of this chapter is organized. With these limitations in mind, it is appropriate to
conclude that some people would regard the WISC-R as biased since mean differences
among groups do exist. However, the hypotheses that the mean differences are caused by
item bias or that the test measures a different attribute in a different fashion
depending on group membership are simply not supported by the data. Whatever it is that
the WISC-R measures, a question to be discussed later, it appears that the same
attribute is measured in the same way regardless of group membership.

ATMOSPHERE BIAS 


In addition to bias in content, another frequent criticism of standardized tests
is that the atmosphere of the testing situation is unfair to minority children. Two
general aspects of the testing environment are mentioned most frequently as possible
sources of unfairness: (1) the kinds of responses and nature of the effort required
on the test or (2) the nature of the interaction with the examiner may be inconsistent
with the child's background or experiences.


A great amount of research has been conducted on atmosphere bias, and is well
reviewed by Sattler (1970, 1973, and 1974). The interested reader is encouraged to
pursue further information in those sources. The major conclusions from this research
are the following:

1) Much of the research was poorly designed.

2) Some of the studies used experimental manipulations that are atypical and
   inconsistent with good testing practices. For example, token reinforcers
   provided for correct answers.

3) The results of reasonably well-controlled studies in which the variables
   manipulated were within the range of good testing practices are
   contradictory. For example, the degree of warmth, amount of encouragement,
   time devoted to establishing rapport prior to testing, and sex or race
   of examinee, have been studied with mixed results. Inconsistency is
   the rule rather than the exception in studies of examiner effects.

4) Examiner expectancies for performance may influence scoring of responses
   on items where there is some subjectivity in evaluating responses, for
   example, the Vocabulary subtest of the Wechsler scales.

5) When differences due to atmosphere effects are reported, the size of the
   differences is usually fairly small.

6) If the phenomenon of atmosphere effects is real, it is doubtful that it
   accounts for very much of the differences among groups on standardized
   tests (Flaugher, 1978).

Although the research on atmosphere effects does not support the existence of
this sort of bias for groups, these results do not necessarily generalize to all
natural settings or to the performance of all individuals. It is essential to recognize
the basic assumption of maximum effort on ability, achievement, and aptitude tests.
If the child cannot or does not perform as well as possible due to unique features of
the testing environment, the results of the test are inaccurate reflections of the
child's thinking competencies or academic skills. In such cases, comparisons of the
child's performance to that of the normative sample are inappropriate.

Professional personnel who administer tests to culturally different persons must
be sensitive to individual variations in values, motivation, language, and cognitive
style, all of which could influence the results of the test. One of the most important
roles of the examiner in individual evaluations is to establish the kind of climate
that will produce the child's maximum effort and performance. In order to be effective
in this role the examiner needs to understand and appreciate the culture of
the child being assessed. Additional information regarding important considerations
in assessment of non-Anglo children is provided by Sattler (1974), Hynd and Garcia
(1979) for Native Americans, by Bartel, Grill, and Bryen (1973) for Blacks, and by
Matluck and Mace (1973) for Chicanos.

BIAS IN TEST USE: PREDICTIVE VALIDITY

A fairly common assertion is that conventional standardized tests such as the
WISC-R provide low estimates of the competencies of minority group examinees. If this
assertion is correct, then bias or discrimination may result from use of the test in
predictions of performance on various criteria (Deutsch, Fishman, Kogan, North, &
Whitman, 1964). If the test is less valid for minority examinees or if the predictions
from the test vary as a function of group membership, then indeed test use is less
effective, and unfair or discriminatory as well if the prediction is too low. In this
section evidence will be reviewed on the validity and predictive accuracy of the
WISC-R when used with minorities.
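
One way to make the idea of predictions that "vary as a function of group membership"
concrete is to fit a separate least-squares line predicting the criterion from the
test score within each group, and then to compare the slopes, intercepts, and
correlations. The sketch below does this with invented IQ and achievement scores. The
function name, the group labels, and the data are hypothetical, and the procedure is a
generic illustration rather than the method of any particular study reviewed here.

```python
from math import sqrt


def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Invented (IQ, achievement) scores for two hypothetical groups.
data = {
    "group_1": ([85, 95, 100, 110, 120], [41, 47, 49, 57, 61]),
    "group_2": ([80, 90, 95, 105, 115], [38, 45, 46, 54, 59]),
}

for group, (iq, achievement) in data.items():
    slope, intercept, r = fit_line(iq, achievement)
    print(f"{group}: slope = {slope:.2f}, intercept = {intercept:.1f}, r = {r:.2f}")
```

Similar slopes and intercepts across groups would mean that a given test score leads
to about the same predicted criterion score regardless of group. A markedly lower
correlation in one group would correspond to the test being less valid for that group,
and a lower intercept for one group would mean that a line fitted to the other group
overpredicts that group's criterion performance.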

Several sources of information are available on the general issue of the validity
of individual and group intelligence tests. Sattler (1974) is a particularly good
source of information for children's scales and Matarazzo (1972) provides an excellent
review and discussion of the relationship of ability measures to a variety of criteria.
Although these sources of information are adequate to answer general questions
concerning validity, they may not be sufficient to meet the PL 94-142 criterion of
"validated for the specific purpose for which they are used." This is especially true
when different criteria are suggested or when conventional criteria are seen as
inappropriate for assessing the validity of tests.

Competing points of view regarding what criterion should be used in studies
of the validity of the WISC-R (and other intelligence tests) were expressed by
different witnesses in the Larry P. court trial. The debate revolved around the
question of whether standardized tests of academic achievement are appropriate
criteria for assessing the degree to which the WISC-R predicts school performance.
Abundant data do of course exist to substantiate the fairly strong, positive
correlation between the WISC-R and standardized tests of achievement (typical
correlations from studies are in the range of .5 to .7). Mercer contended that the
ability and achievement tests were measuring the same thing; that the traditional
distinction between the tests was artificial; and that both ability and achievement
tests share the same kinds of biases against minorities. Mercer then suggested that
grades and teacher ratings of classroom performance were the only really independent
sources of information (i.e., independent of IQ) regarding performance in the academic
setting, and hence, the only appropriate criteria to use in studies of the validity of
the WISC-R. Others have disputed this point of view (e.g., Clarizio, 1979a; 1979b).

Regardless of which criterion is used to assess academic performance, predictive
validity of the WISC-R for the criterion school achievement is clearly a necessary
prerequisite to fairness in the use of the WISC-R with minorities by school
psychologists. Fortunately, recent studies do appear to support the predictive validity of
the WISC-R for both types of criteria of academic performance. In Table 4 results
from several large sample studies are presented.

The results from several recent studies summarized in Table 4 support the validity
of the WISC-R as a predictor of achievement for minority and majority groups. The
magnitude of the correlations was about the same for all groups, with the possible
exception of Native American Papagos, where the relationships were generally lower. The
correlations between the WISC-R Full Scale IQ and standardized achievement test results
were in the typical range of .5 to .7 for the three groups included in both studies.
Goldman and Hartig (1976) published data on the relationship of the WISC to three
measures of classroom performance for large samples of Anglo, Black, and Chicano children
from Riverside, California. The WISC was administered in 1967 with the measures of
classroom performance apparently collected at varying times between 1967 and 1969.
Teacher assigned grades over the next two years were collected and organized into a
composite for Academic GPA. The Academic GPA was a rather strange amalgamation of
grades in academic and nonacademic subjects including, "music, health, art, reading,
arithmetic, math, social studies, science, language, spelling, writing, instrumental
music, physical education, composition and grammar, history, geography, and foreign
language" (p. 585, emphasis added). The relationship of the WISC to the Academic
GPA measure was relatively low for all groups, but higher for Anglos (.25) than for
Chicanos and Blacks (.12 and .14 respectively). Mercer (1979) reported similar
results for the same groups. Again, the measure of Academic GPA was a rather unusual
combination of grades in academic and nonacademic subjects. Other studies using
teacher ratings of academic performance revealed no evidence of differential validity
(Reschly & Reschly, 1979; Reschly & Ross-Reynolds, 1980; Hartlage & Steele, 1977).

Overall, studies on the relationship between measures such as the WISC-R and
academic performance are generally positive. Clearly, the only evidence for lower or
differential validity when the criterion for academic achievement is a standardized
test comes from one sample of Native American students. For other groups, Anglos,
Blacks, and Chicanos, the WISC-R predicted standardized achievement test performance
equally well regardless of group membership. The data regarding the relationship of
the WISC-R to teacher ratings or grades are less definitive for a variety of reasons.
There is the sticky problem of the reliability (and validity) of teacher ratings.
Despite the obvious problems with this criterion, there are data to support the
validity of the WISC-R as a predictor of teacher ratings for different racial and ethnic
groups. To return to our original questions at the beginning of this section, the
presently available evidence does indicate that the WISC-R is unbiased on the
criterion of predictive validity. This conclusion, of course, must be made somewhat
conditional due to some variations in studies and insufficient evidence concerning
all groups of potential interest.

                                    Table 4

               Correlations of WISC-R Full Scale Score with
               Standardized Tests of Achievement and with
               Teacher Ratings/Grades

                                            Achievement Measure

                                    Test            Test            Teacher Rating
Sample                    Group     Reading         Math            or Grades

Pima Co., AZ              A         .56             .5              .35
(Reschly & Reschly,       B         .62             .51             .45
1979)                     H         .55             .50             .38
                          NAP       .41             .43             .34

Austin, TX                A         .72             .64             NA
(Oakland, 1977)           B         .64             .61             NA
                          H         .64             .59             NA

Riverside, CA             A         NA              NA              .25
(Goldman & Hartig,        B         NA              NA              .14
1976)                     H         NA              NA              .12

Riverside, CA             A         NA              NA              .44 (Mdn.)
(Mercer, 1979)            B         NA              NA              .27 (Mdn.)
                          H         NA              NA              .24 (Mdn.)

Iowa Assessment           A         Not Analyzed    Not Analyzed    .60 (Mdn.)
Project (Reschly &        B         Not Analyzed    Not Analyzed    .55 (Mdn.)
Ross-Reynolds, In
Press)

Notes: 1) A, B, H, & NAP denote Anglo, Black, Hispanic, and Native American
          Papago, respectively.

       2) The Metropolitan Achievement Test was used in the Pima County study.

       3) The California Achievement Test was used in the Austin, TX study.

BIAS IN TEST USE: SOCIAL CONSEQUENCES

The previous definitions of test bias, although important, are inadequate in
terms of the overall influence of tests upon the lives of persons. Testing does
have social consequences. Tests, even those which predict accurately, have been
misused to justify race, social class, and ethnic discrimination. Kamin (1974)
and Cronbach (1975) provided ample evidence regarding the misuse of tests to
justify racial and ethnic discrimination in the early decades of this century. What
was surprising to me was the rather frequent use of IQ test results to justify
racial segregation in public schools during the 1960s. Bersoff's (1979) review
of the litigation regarding these practices demonstrated clearly that misuse of
test results to justify discrimination was not simply an unfortunate event in
American psychology that occurred a long time ago, but that such misuses have occurred
fairly recently. Further, the implications for and occasional recommendations
regarding social policy that are justified today by citing group differences on IQ
tests are potentially as discriminatory and abusive with regard to individual and
group rights as anything done in the past (e.g., Shockley, 1971). IQ test results
have sometimes led to a reduction of opportunities for persons, and have qualified
persons for apparently ineffective interventions which may have been stigmatizing
and humiliating. At the same time, we would be remiss if we didn't emphasize that
standardized test results have also been instrumental in removing existing barriers
and in increasing opportunities for many minority persons. However, to defend tests
simply on the basis of predictive accuracy is to miss entirely the points raised by
recent critics of tests.

Jackson's (1975) response to the report of the American Psychological Association
Committee on Educational Uses of Tests (Cleary, Humphreys, Kendrick, and
Wesman, 1975) is even more to the point. Jackson saw the report as largely
irrelevant to the concerns expressed by minorities. The report defended the technical
adequacy of the tests when in fact the major concerns of Black and Chicano
psychologists (Bernal, 1975) are with how tests affect the lives of persons. The fact that
tests have been used by some to justify racist ideology, and otherwise have been
misused or misinterpreted in inferences about the potential of individuals, is
acknowledged even by the authors of the APA report. Thus, to defend tests on the
basis of evidence of common regression systems, or to attempt to separate the issues
of technical adequacy from those of social consequences, is insufficient for our
purposes of attempting to enhance fairness in test use.

Overinterpretation, Misinterpretation, and Misuse of Test Results

Much of Mercer's recent work would appear to be directed quite properly toward
eliminating the misinterpretation of IQ test results. The issues that have become
involved in the debate over SOMPA have occasionally led the discussion away from this
very crucial effort. Mercer emphasizes that all current ability and aptitude tests
are measures of learning. There should be no disagreement over this point if we
merely consider the content of tests and the constitutional repertoires of human
infants. It is true that many, perhaps most, skills measured by test items do depend on
certain maturational developments, but learning after the maturational readiness is
achieved is still necessary for mastery of the skills. Therefore, in a general sense
IQ tests such as the WISC-R clearly are tests of learning.

It is not difficult to locate numerous examples of overinterpretation of the
WISC-R. For example, use of the WISC-R subtest patterns, or differences between Verbal
and Performance Scale IQs, as the basis for a diagnosis of learning disabled, mild
mental retardation, or even emotional disturbance is all too common. These diagnostic
inferences are part of longstanding tradition (and folklore) in applied areas of
psychology. Certain technical problems such as unreliability of difference scores and the
dangers of making generalizations to individuals from studies of intact groups have
been known, but not appreciated sufficiently, for many years. Recent data on the base
rates of subtest fluctuations and IQ scale differences should certainly reduce this
sort of overinterpretation of the WISC-R. Kaufman (1979a, b) reviewed data from the
WISC-R standardization sample which demonstrated unequivocally that subtest
fluctuations and IQ scale differences are the rule, not the exception, for normal children.
Continued use of the WISC-R patterns to establish or even support a differential
diagnosis is clearly indefensible. Readers interested in these data are referred to
Kaufman's very clear discussions of appropriate interpretation of the WISC-R.
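
The unreliability of difference scores noted above can be illustrated with the
classical test theory formula for the reliability of a difference between two equally
variable scores, r_dd = ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy). The following is only
a minimal sketch: the two scale reliabilities and the correlation between the scales
are hypothetical values chosen for illustration, not figures taken from the WISC-R
manual or from Kaufman's analyses.

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of the difference X - Y under classical test theory.

    Assumes the two scores have equal variances. r_xx and r_yy are the
    reliabilities of the two scores; r_xy is the correlation between them.
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical values (assumptions for illustration only).
verbal_reliability = 0.94
performance_reliability = 0.90
verbal_performance_correlation = 0.67

r_dd = difference_score_reliability(
    verbal_reliability, performance_reliability, verbal_performance_correlation
)
print(f"Reliability of the difference score: {r_dd:.2f}")
```

Under these assumed values the difference score is noticeably less reliable than
either scale alone, and the gap widens as the two scales become more highly correlated.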

Unfortunately, the kind of overinterpretation described above
probably is not the most serious misuse of IQ test results. Results from
intelligence tests such as the WISC-R are all too often believed to be fixed, unitary,
and predetermined by genetic factors. These myths are too prevalent among
consumers of test results, e.g., parents and teachers, and even perhaps among school
psychologists, for us to ignore. Reactions to these myths, which lead to
misinterpretation and misuse of intelligence test results, are among the most frequent
concerns expressed by critics of intellectual assessment with minorities. These myths
were also a major underlying concern in the placement litigation of the early 1970s
(Reschly, 1979).

The myths that IQ test results are fixed and that intelligence is unitary are
relatively easy to refute. I know of no one in the field who argues that present
IQ tests measure all or even a majority of the important capabilities and
competencies related to success and overall adaptation. Certainly the authors of major tests
such as David Wechsler recognize that even our very best instruments do not measure
everything of importance and that intelligence is a multifaceted, not a unitary,
attribute of the individual. The fact that IQ scores are not fixed, i.e., do not
stay constant, is readily apparent from careful examination of data from longitudinal
studies (McCall, Appelbaum, and Hogarty, 1973). It is true that scores on IQ tests
are fairly stable after age 6 for groups of individuals. However, the IQ scores for
a significant percentage of individuals (at least 20 percent) change by 15 points or
more between age 6 and maturity, and considerably larger changes of 30 or 40 points
have been reported for a few cases. When large changes do occur they tend to be
associated with significant changes in the individual's environment or overall
emotional adjustment. The fact that IQ test scores do change as a function of changes in
the individual or the environment might be seen as evidence for increasing our confidence
in the test results as indicators of current intellectual functioning, probably the
most common interpretation of IQ test results. We need to be conscious of the fact,
and inform others, that scores do change, and that inferences about the future
intellectual status of the individual are always tentative.

The final myth, that IQ is predetermined by genetic factors, is a bit more
complex. As noted earlier, the information or problem-solving skills required on IQ test
items are learned. However, this fact does not preclude the influence of genetic
factors on test scores. Although nearly irrefutable data exist to prove that genetic
factors influence measured intelligence, the unanswered (and unanswerable) issues are
the amount of influence attributable to genetics and the genetic influence on the score
for an individual. Discussion of the first "unanswerable" question is far beyond the
scope of this chapter. Consideration of the question of the genetic influence on the
score of an individual is a central issue in resolution of problems of
misinterpretation of IQ test scores.

Mercer (1979) provided an excellent summary of the precise conditions that must
be met in order to legitimately interpret the differences in scores of individuals
(or groups) as reflecting different levels of innate potential. These conditions are:
1) Equal exposure to opportunities to learn the information or problem solving skills
measured by the test; 2) Equal levels of motivation to learn and [illegible]
learning whatever the test requires; 3) Equal familiarity with tests and test-taking
situations; 4) Persons (or groups) being compared are equal on affective factors
such as anxiety, fear, and emotional turmoil which might interfere with learning or
performance on the test; 5) Persons (or groups) being compared are equal in physical,
sensory, or motor abilities which might interfere with test performance or
learning. Meeting these criteria in any practical situation in which the WISC-R is part of
the assessment battery for an individual is virtually impossible. I might add that
meeting the criteria or controlling their effects in research on groups is very rarely,
if ever, possible.

Those of us who conduct intellectual assessments with tests such as the WISC-R
have a special responsibility to protect our clients from misinterpretation of test
results. Several courses of action appear to be needed at the present. The myth
that intelligence is unitary, and the only attribute of a student that we consider
to be important in classification and programming, can be dispelled most effectively
by carrying out the full multifactored assessment requirements of PL 94-142 (Tucker,
1977; Reschly, 1979). Many of the PL 94-142 requirements, particularly the phrases
"No single procedure is used as the sole criterion..." and "...to assess specific
areas of educational need and not merely those which are designed to provide a single
general intelligence quotient," appear to be designed to alleviate past misuses of
intelligence test data (Federal Register, 1977). We can argue about, and I believe
refute, the notion that IQ tests were used as the single source of information in
previous classification and placement decisions involving minorities. However, the
documentation provided for classification and placement decisions often appeared to
place primary, if not sole, reliance on IQ test data. Implementing and documenting
the multifactored assessment requirements should both dispel any remaining
misconceptions that we believe intelligence to be unitary and lead to better
classification and programming decisions.

Another desirable step in reducing misconceptions would be to change the name
of the construct that IQ tests measure. The validity evidence for IQ tests indicates
relatively strong predictive validity for performance in academic settings. This
relationship is certainly not trivial, and can be shown to be related to other
variables such as occupational attainment (Matarazzo, 1972). However, the relationship
is somewhat limited. In recent work I have suggested the term "academic aptitude" as
a more accurate characterization of what the WISC-R and other IQ tests actually
measure (Reschly, 1979). Mercer (1979) suggested the term "School Functioning Level"
(SFL), which appears to be motivated by the same concern regarding reducing
misinterpretation of IQ test results. Changing the name is of course not a panacea for
misinterpretation. It is a step in that direction.

In view of the continuing problems with misinterpretation of IQ test results by
consumers of test information, particularly parents and teachers, we developed the
following statement for use in school psychology practicum work at Iowa State
University. We believe the statement might be used as a kind of "Surgeon General's
Warning" about IQ that should appear on reports, protocols, and perhaps, in test manuals.
It is consistent with our belief that misunderstanding IQ test information could be
damaging to the "psychological health" of the child.

     IQ tests measure only a portion of the competencies involved
     with human intelligence. The IQ results are best seen as
     predicting performance in school, and reflecting the degree
     to which children have mastered middle class cultural symbols
     and values. This is useful information, but it is also limited.
     Further cautions: IQ tests do not measure innate-genetic capacity
     and the scores are not fixed. Some persons do exhibit significant
     increases or decreases in their measured IQ.

I'm sure the statement could be improved. Perhaps the task of developing an
appropriate statement should be referred to the committees in NASP and APA Division 16
that deal with social issues, which I believe this certainly is. In any event, it
reflects our desire to reduce misinterpretation of IQ tests, which yield information
we consider, and encourage others to consider, as valuable, but limited.

Selection Ratios



One of the most important social consequences of the use of IQ test information
in classification and programming decisions is that disproportionate numbers of
certain minorities may be deemed eligible for special education programming.
Overrepresentation of minorities in special education programs was the initial complaint in
the placement litigation of the early 1970s, where the courts implicitly used the
rather simple notion of selection ratios as evidence of bias (Reschly, 1979). While
we are considering the issue of overrepresentation, some clarification of the
percentages cited to establish the disproportionality is in order. In the Larry P. case,
indisputable facts were that Black students constituted about 10% of the total student
enrollment in the California public schools, and that about 25% of the enrollment in
special classes for the mildly retarded was Black. I suspect that many have made the
totally erroneous conclusion that many if not most Black students were in programs for
the mildly retarded. The analysis of California enrollment data in Table 5 indicates
that only a small percentage of Black students were placed in special classes for the
mildly retarded. These data certainly do not support the extreme criticism that the
primary purpose of IQ tests is to label minority children as "uneducable".

                                  Table 5

                  Analysis of California Enrollment Data

                                                1968-69       1976-77

Total student enrollment                      4,500,000     4,380,000
Total Black enrollment (10%)                    450,000       438,000
Total enrollment in special EMR classes          57,148        19,289
Black enrollment in special EMR classes          14,573         4,899
                                                (25.5%)       (25.4%)

Percent of total student enrollment placed in special EMR classes:

     1968-69:  57,148 ÷ 4,500,000 = 1.3%
     1976-77:  19,289 ÷ 4,380,000 = 0.4%

Percent of Black children placed in special EMR classes:

     1968-69:  14,573 ÷ 450,000 = 3.2%
     1976-77:   4,899 ÷ 438,000 = 1.2%
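
The arithmetic behind Table 5 can be retraced with a short script. This is a sketch
only: the enrollment counts are those printed in the table, while the dictionary keys
and the function name are hypothetical labels introduced for illustration. It
separates two quantities that are easily confused: the Black share of special EMR
enrollment (about 25%) and the share of all Black students who were placed in special
EMR classes (a few percent). Computed directly from the counts, the last quotient for
1976-77 comes to about 1.1%, slightly below the 1.2% printed in the table.

```python
# Enrollment counts as printed in Table 5 (California public schools).
ENROLLMENT = {
    "1968-69": {"total": 4_500_000, "black": 450_000,
                "emr_total": 57_148, "emr_black": 14_573},
    "1976-77": {"total": 4_380_000, "black": 438_000,
                "emr_total": 19_289, "emr_black": 4_899},
}


def placement_rates(counts: dict) -> dict:
    """Return three percentages (0-100) derived from one year's counts."""
    return {
        # Share of special EMR enrollment that was Black.
        "black_share_of_emr": 100 * counts["emr_black"] / counts["emr_total"],
        # Share of all students placed in special EMR classes.
        "all_students_in_emr": 100 * counts["emr_total"] / counts["total"],
        # Share of Black students placed in special EMR classes.
        "black_students_in_emr": 100 * counts["emr_black"] / counts["black"],
    }


for year, counts in ENROLLMENT.items():
    rates = placement_rates(counts)
    print(
        f"{year}: Black share of EMR enrollment {rates['black_share_of_emr']:.1f}%, "
        f"all students in EMR {rates['all_students_in_emr']:.1f}%, "
        f"Black students in EMR {rates['black_students_in_emr']:.1f}%"
    )
```

The comparison makes the point of the text concrete: overrepresentation within the
special classes is real, yet the great majority of Black students were not placed in
them in either year.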



The precise role of IQ tests, most often the WISC-R, in the referral,
assessment, classification, and placement process with minorities is not entirely clear.
The courts seemed to assume that IQ tests were the primary factor in this entire
process, and thereby the major cause of overrepresentation of minorities. This
assumption is probably an oversimplification of the actual course of events. Meyers,
Sundstrom, and Yoshida (1974) pointed out that IQ testing follows teacher referral
and therefore is not the first nor, perhaps, even the primary step in the process.
Mercer (1973) reported that some children with IQs below the eligibility cut off
scores are never referred (and therefore not assessed or placed) while some others
with IQs above the cut off scores are referred and assessed, but not placed. This
raises an intriguing question. What has been the overall effect of the WISC-R on
proportions of minorities classified and placed? Is the effect of IQ testing to
increase or decrease the overrepresentation that would occur if the primary criteria
for placement were classroom grades and teacher referral? Although the data on this
issue are quite limited, there is some evidence indicating the overall effect of IQ
tests is to protect minorities from misclassification (Ashurst and Meyers, 1973).
More data on proportions of children from diverse groups who fail on various criteria
at the different stages of grades, referral, formal assessment, classification, and
placement would clarify this issue. Although there is much opinion to the contrary,
it appears likely that IQ tests have served to reduce, not increase, the proportions
of minorities classified and placed in special education programs.

Application of the multifactored assessment requirements may reduce the
overrepresentation of minorities in the future, though this is not clear at the present.
The overall effect of using a broader variety of information on classification and
programming with minorities will likely be determined by how adaptive behavior and
sociocultural background are conceptualized, measured, and used. If adaptive
behavior is conceptualized narrowly as nonacademic social role performance only,
measured with instruments such as the SOMPA Adaptive Behavior Inventory for Children
(ABIC), with a low score required for classification and placement, then
overrepresentation is likely to be reduced, perhaps substantially (see later section). Use of
sociocultural information to reinterpret WISC-R scores as in SOMPA might also have
the effect of reducing the overrepresentation. We can only speculate on the question
of whether these changes would be beneficial to minorities.

Summary 

Is the WISC-R biased against minorities? Is the WISC-R valid when used with
minority children? This section has been devoted to a discussion of these seemingly
simple questions. However, the answers are complex and tentative. Decisive and
unequivocal conclusions are impossible due to the diverse conceptualizations of the
basic problem of bias and the somewhat limited data base.

Conclusions regarding validity and bias of the WISC-R with minorities obviously
vary depending on the definition of bias. As noted earlier, there is no single
"correct" definition of bias. If definitions are used which stress various internal and
external criteria, the research evidence suggests the WISC-R is both valid and
unbiased when used with minorities. Other definitions which stress mean differences,
selection ratios, and the social consequences of test use result in the opposite
conclusion, i.e., that the WISC-R is biased and, depending on the value judgments applied
to specific situations, perhaps invalid as well.

The reassuring evidence regarding the internal and external validity of the
WISC-R provides a foundation for our efforts to eliminate the other possible sources of
bias. Of particular concern is the evidence that the results from intelligence tests
such as the WISC-R have sometimes been misused to justify race, class, and ethnic
discrimination; have sometimes been misinterpreted as indicating innate potential; and
have been part of a process whereby minority students were placed in programs that
allegedly were ineffective. These undesirable social consequences of test use,
although not universal and not an intrinsic characteristic of the test, have been too
common for us to ignore. Elimination of discrimination, correct interpretation, and
effective interventions are essential components of the effort to ensure useful and
fair assessment for all persons. The WISC-R can be a valuable instrument in that
effort.

OUTCOMES CRITERION 

The most damaging allegation by minority critics of intelligence (academic
aptitude) tests is that through their use minority children have been differentially
exposed to ineffective educational programs which also had the effects of creating
stigmas, reducing self-concept, and restricting career opportunities. Based on the
review of the WISC-R to this point, it would appear that the fundamental problem is
the outcome of test use, not the test per se. This allegation, however, is serious.
Failure to understand this concern has probably contributed to the poor
communication between critics and proponents of tests like the WISC-R. We (proponents)
have focused on various internal and external criteria of validity while the critics
have raised the broader, and clearly legitimate, question of what happens to
minority children as a result of test use.

One result of test use with minorities has been overrepresentation in special
education programs. Are these programs effective? The evidence to date, although
enormously complex, is not particularly positive, at least for the special class kind
of intervention. It should be noted that this evidence is the subject of considerable
debate (Kolstoe, 1975). However, if the special class programs are as ineffective as
some critics charge (e.g., Dunn, 1968), then no child, regardless of ethnic or racial
status, should be placed in the programs.

In an effort to focus attention on what was conceived as the overriding issue
in test bias, i.e., the outcomes of test use for the individual, the following
definition of bias in assessment was developed.

     Assessment which does not result in effective interventions
     should be regarded as useless, and biased or unfair as well,
     if ethnic or racial minorities are differentially exposed to
     ineffective programs as a result of assessment activities
     (Reschly, 1979).

The two essential components of this definition of test bias are usefulness and
fairness. Usefulness in the sense of assessment resulting in effective interventions
that improve skills and competencies, and thereby enhance opportunities, is a
paramount goal of school psychological and special education services. The usefulness of
assessment instruments such as the WISC-R should be determined on the basis of the
degree to which they contribute to realization of this goal. It is acknowledged that
there are some instances in which assessment leads to accurate diagnoses for which
there are no known effective interventions. These diagnoses may still be "valid" in
the sense of validity used by Cromwell, Blashfield, and Strauss (1975), if they improve
estimations of prognosis or contribute to prevention of the condition in future cases.
However, accurate prognostic estimates or prevention of the condition in future cases
are rarely of benefit to the individual being assessed if effective interventions
cannot be developed.

In this conception of bias in assessment the concern for fairness is closely
related to the notion of usefulness. Assessment and accompanying diagnoses are seen
as biased or unfair if they result in overrepresentation of minorities in programs
that are ineffective, or in no planned interventions at all. Under such circumstances,
the diagnosis may be accurate and the assessment conducted competently, but it is
difficult to identify any benefit to the individual. Moreover, if there is a negative
connotation or stigma associated with a diagnosis which occurs more often with
individuals from minorities, the assessment leading to that diagnosis would be regarded as
biased or unfair in the above circumstances. On the other hand, assessment which leads
to accurate description of current behaviors, to diagnoses which are essentially
summary statements of these behaviors, and to effective interventions, should be regarded
as fair or unbiased regardless of the ethnic or racial composition of student groups.
Over or underrepresentation of minorities in various classifications or programs is
therefore not sufficient to establish bias from this conception.

A number of factors can be identified as prerequisites to achieving fairness in
assessment using this approach (Reschly, 1979). However, the more narrow test based
criteria discussed earlier in this chapter are usually necessary conditions for
fairness in assessment. In order to implement this conception of nonbiased assessment,
the tests used must be reliable and valid for all groups; the test results must not
be unduly affected by situational-examiner effects; and the content of the tests
must reflect important domains of behavior for all groups. These conditions are all
necessary, but not sufficient, conditions for assessment to be useful and unbiased.

PREREQUISITES TO NONBIASED ASSESSMENT

In this paper, nonbiased assessment is defined in terms of outcomes for the
individual. Assessment that is useful in relation to providing effective educational
and psychological interventions is regarded as fair and beneficial to the individual.
Valid and reliable assessment instruments are necessary conditions to achieve this
goal. Other variables such as what to assess, the link between assessment and
programming, effective alternatives, etc. also are necessary conditions. The broader
context for this discussion is good fundamentals in assessment and ethical
professional practices.

Good Fundamentals and Ethical Practices

This is not the proper forum for an attempt to specify all the competencies needed
by related services personnel, or the major provisions of professional ethics.
However, these areas are crucial to fair and useful assessment. In some of the placement
bias cases there were well documented instances of outright incompetence and clearly
unethical practices. Although these cases are probably rare, they do establish the
need for all of us to assume direct responsibility for the quality of our services,
and indirect responsibility for the professional work of our colleagues.

Clarification of Purpose

Clarification of the purpose for assessment activities is an important, but
frequently ignored, aspect of good fundamentals. Salvia & Ysseldyke (1978) provide an
excellent description of the usual purposes for assessment in remedial and special
education. Related services personnel such as psychologists and social workers
typically engage in assessment activities for two purposes: Classification/Placement or
Program Planning/Intervention. These two purposes usually involve different types
of decisions and different types of instruments.

The Classification/Placement purpose typically involves decisions about current
level of performance, degree of discrepancy from grade or age expectancies, degree
and type of need, and eligibility for special programming. The questions typically
are addressed by comparing the individual student's performance with that of some
group, usually a representative sample of other students. In recent years these
comparisons have been called norm referenced.

Assessment instruments and other data collection procedures for classification/
placement decisions should meet certain requirements. The items should be
representative of some domain of behavior. The sample of items (or observations)
should be sufficient to infer the individual's level of competence in the area. The
inferences about degree of discrepancy from expectations should be based on
comparisons to a representative sample, i.e., good norms. The scores used in these
comparisons should have relatively equal units throughout the scale, and so on. The
scores should be highly reliable if decisions are made about individuals. If the
scores for a particular instrument are not highly reliable (e.g., .9 or above), then
multiple sources of information using different instruments or data collection
procedures should be developed and considered in making decisions. Finally, if
inferences are made about underlying traits such as intelligence or psychological
processes, the instrument must have good predictive validity relative to appropriate
criterion behaviors in educational settings.
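
The reliability guideline above can be illustrated with the standard error of
measurement from classical test theory, SEM = SD x sqrt(1 - reliability). The Python
sketch below is an illustration added for clarity, not a procedure taken from the
sources cited. The scale standard deviation of 15, the observed score of 70, and the
function names are assumptions chosen for the example, and the band is centered on
the observed score, which is a common simplification.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sd: float, reliability: float, z: float = 1.96):
    """Approximate band around an observed score (z = 1.96 gives roughly 95%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


if __name__ == "__main__":
    # Assumed values: a scale with SD 15 and an observed score of 70.
    for reliability in (0.95, 0.90, 0.80):
        low, high = score_band(70.0, 15.0, reliability)
        print(f"reliability {reliability:.2f}: band {low:.1f} to {high:.1f}")
```

As reliability falls, the band around a single score widens, which is one way to see
why less reliable instruments call for multiple sources of information before a
decision is made about an individual.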

Program Planning/Intervention decisions require somewhat different types of
assessment information and different types of instruments. Rather than general degree
of need or overall strengths and weaknesses, information is needed on very specific
skills or competencies. Data collection from this perspective, now often called
criterion referenced, is designed to pinpoint precisely what the child can and cannot
do in some important domain of behavior. The items on such instruments should provide
thorough coverage of the important skills or competencies rather than representative
sampling. The items or observations should be related to important objectives and,
ideally, to clearly specified interventions.

Most current instruments or observation procedures do not meet the necessary
criteria for both purposes. In nearly all cases, a particular instrument or
observation procedure has desirable characteristics for norm referenced,
classification/placement decisions OR criterion referenced, program planning/
intervention purposes. Of course, many instruments do not meet the criteria for
either. Many of the mistakes in assessment work originate in failure to clarify
purpose. Sometimes we attempt to use the same instrument for both purposes, e.g., use
of the WISC-R to suggest educational programming objectives and to determine
eligibility for special programs. The WISC-R has many desirable features for certain
classification/placement decisions. It is largely irrelevant to specific decisions
about educational programming.

Clarification of purpose will lead to different and more varied strategies in
assessment.

Relevant Assessment

Assessment which meets the outcomes criterion suggested in this paper must be
relevant to educational programming, or in the words of the PL 94-142 Rules and
Regulations, "...tailored to assess specific areas of educational need...". A number
of current trends in assessment practices enhance the relevance of assessment.

Assessment-Intervention-Evaluation. Assessment for classification/placement is
important, but insufficient in relation to the outcomes criterion. Related services
personnel increasingly have the opportunity to be involved with other types of
assessment, such as assessment for: 1) Decisions about special education program
options, e.g., resource vs. special class; 2) Intervention goals; 3) Intervention
strategies; and 4) Evaluation of intervention outcomes. In addition, school
psychologists and social workers have opportunities to use behavioral consultation
strategies in the home and school. These strategies, involving behavioral assessment
procedures, reflect one of the clearest examples of the overall link between
assessment, intervention, and evaluation of outcome (Bergan, 1977).



The PL 94-142 requirement that a member of the diagnostic team serve on the
committee which designs the initial IEP provides the opportunity for most related
services personnel to become more involved with decisions about interventions. Many
will have opportunities to participate in annual reviews, and nearly all will be
involved with the mandated re-evaluations every three years. The three year
re-evaluations are often downgraded in the priorities of related services personnel.
This is indeed unfortunate. One of the important questions in this re-evaluation is
classification or continued eligibility. Perhaps even more important is careful
evaluation of the effectiveness of the special education programming, and examination
of the areas of educational need. How we view the re-evaluations will be heavily
influenced by whether we see ourselves as classification personnel, OR whether we
adopt the outcomes criterion. Nevertheless, the opportunities now exist for
significantly greater involvement in all phases of designing, carrying out, and
evaluating interventions.

Reduced Level of Inference. Relevant assessment involves a lower level of
inference. School and other areas of applied psychology have an unfortunate tradition
of combining "clinical insight" with very minimal data, resulting in global
descriptions of persons. Many of the standard interpretations of test results involve
analogical reasoning with little or no empirical support. The analogical reasoning
used in the interpretation assumes that a logical relationship exists between the
observed behavior and underlying dynamics, or, "nothing is quite what it seems to be."
Usually the empirical support for the interpretation simply does not exist, or the
strength of the relationship, although statistically significant, is so low that
prediction for individuals is hazardous at best. An example may clarify these points.
A common interpretation of dark, heavy lines on the Bender designs is "repressed
hostility," even when the designs are reproduced accurately. This "emotional
indicator" is frequently discussed in reports without any additional or external
verification even though the empirical evidence is weak (Koppitz, 1975, p. 85). These
"signs" may provide cues to important behaviors that should be assessed in relevant
situations. However, the sign as such is based largely on analogy, is likely to be
inaccurate for the individual, and even worse, may impede efforts to develop
interventions. Similar reasoning and interpretations for a variety of other tests are
found in standard clinical texts (e.g., Rapaport, Gill, & Schaefer, 1968), which are
frequently used in school psychology training.

Another change related to the reduced level of inference is less emphasis on
underlying dynamics. The frequent question at staffings after potentially useful
objective information is presented is "What is really going on?" This question often
serves as a cue for all manner of speculation about "pathological" family dynamics,
who perceives whom as what, juicy anecdotes about sexual proclivities, and so on.
These speculations, and the high level of inference upon which they are based, might
be useful IF effective interventions were the result. The usual outcome, however, is
participant satisfaction over their apparent insight and understanding regarding the
problem. These underlying dynamics are rarely used to design interventions, if for no
other reason than the impossibility of influencing the variables involved. If the
question of "What is really going on?" leads only to speculation without specific
interventions, then the entire exercise should be regarded as professional voyeurism.
At a minimum, it is useless assessment.

There are several trends which will continue to move the field toward a reduced
level of inference and less emphasis on underlying dynamics. One influence is the
courts, as well as the quasi-legal appeal procedures established as part of the due
process regulations. Speculative inferences based on minimal data have not been well
received by the courts (Ziskin, 1975). Another influence is the PL 94-142 requirement
that tests be validated for the purposes for which they are used. Presumably,
testimonial evidence from satisfied clinicians will not suffice. Finally, the strong
emphasis on designing interventions, and on review and evaluation of interventions,
will necessitate greater consideration of other more useful information.


Situational or Behavioral Assessment. Behavioral or situational assessment is
perhaps the most rapidly expanding model of assessment today. The behavioral
approach, with the emphasis on precise formulation of goals, careful observation of
situational factors, implementation of specific interventions, and evaluation of
outcomes, is consistent with many requirements of recent legislation and the outcomes
criterion proposed in this paper.

The behavioral approach has been regarded as too restrictive by many school
psychologists. Somehow the emphasis of behaviorists on functional control rather than
explanation and understanding has appeared to reduce school psychology to technology
rather than "science" or profession. Those who have made these judgments in the past
are encouraged to reconsider the question of theoretical model through reviewing the
advances of the past decade in behavioral theory, assessment, and interventions
(Bergan, 1977; Cone & Hawkins, 1977; Keller, 1980; Meichenbaum, 1977). Attention also
is directed to a recent article suggesting a behavioral perspective on the use and
interpretation of intelligence tests (Nelson, 1980). Behavioral models now include
and operationalize broad classes of behavior such as cognitive style, social skills,
anxiety, etc. The behavioral assessment techniques have been refined to include a
broad variety of instruments and observation methods for collecting useful
information, many of which are relatively unobtrusive in natural settings. Perhaps
the greatest advances have been in the use of more natural interventions such as
self-control, cognitive self instruction, rehearsal, modeling, and naturally
occurring reinforcement contingencies.

Placement Options and Effective Programs 

If we accept the notion that possible bias in assessment is best conceptualized in
terms of outcomes, then the availability of effective educational programs and
alternative placement options is an absolute prerequisite to implementing nonbiased
assessment procedures. In the situations which resulted in the special education
placement litigation, the educational programs were presumed to be ineffective and
the range of options limited. The author remembers all too well the very limited
range of options that was typical until quite recently. The only choices often were
regular classrooms with no assistance or self-contained, segregated classes for the
mildly retarded. Many psychologists can recall vividly cases where we knew the child
was not "really" retarded, but in view of very low achievement accompanied by
increasingly negative attitudes toward school and self, the self-contained,
segregated class appeared to be the best option.

This situation has changed, or is in the process of change. A wide range of
options is increasingly available, the principle of using the least restrictive
alternative is the law of the land, and greater emphasis is placed on effectiveness
of interventions through individualized educational programs with annual review.
These changes provide the opportunity for assessment activities in a broader variety
of areas. In addition to classification decisions, assessment should be directed
toward decisions concerning choice of least restrictive alternative and toward the
content of interventions, especially identifying specific areas of "educational" need
in terms of social, emotional, and academic development. Assessment should also yield
information concerning the approach to intervention, specifically, changes in
antecedent, situational, and consequent environments that can be used to carry out
interventions. Finally, we need to gather information that is relevant to and/or can
assist others in evaluating the effectiveness of interventions.

Multifactored Assessment

The concept of multifactored assessment apparently was the primary solution to
the dilemma of defining and describing the requirement of nonbiased assessment in the
PL 94-142 Rules and Regulations. The underlying (and logical) assumption is that
assessment is likely to be less biased if a broad variety of information is collected
and considered systematically in making classification/placement decisions. This
assumption is sound, but insufficient. Improved classification decisions are
certainly important, but even more important is the use of the multifactored
information in designing and evaluating interventions.


Tucker (1977) provided a description of the categories of information which
should be developed in a comprehensive assessment of children "for possible mildly
handicapping conditions." For the most part, the categories of information are
fairly standard and largely consistent with traditional descriptions of comprehensive
psychoeducational evaluations. The arrangement of the categories of information,
especially the sequence suggested for collecting the information, is somewhat unique.
These categories have been further modified through the concepts of low and high
inference procedures in the scheme presented in Table 6. It should be noted that
several activities should occur before the preplacement evaluation is initiated (see
Guidelines at the end of this paper). Among these activities are screening of
referrals, clarification of referral problem(s), interventions within regular
education, etc. If these procedures are followed, and it is determined that a severe
discrepancy exists and regular education alternatives have been unsuccessful, then
the preplacement evaluation should be initiated.

Table 6

Multifactored Assessment

A. SCREENING PHASE

   1. Referral. Clarify referral through teacher interview, classroom observation,
      and examination of daily work.

   2. Educational History. Review current and previous educational records
      including special services, classroom performance, standardized tests,
      etc. Consider use of regular education options and interventions.

IF THERE IS A SEVERE DISCREPANCY, OR IF THE DEFICIT IN PERFORMANCE IS COMPREHENSIVE AND
LONG TERM, IF REGULAR EDUCATION OPTIONS HAVE BEEN ATTEMPTED UNSUCCESSFULLY, THEN
INITIATE THE PREPLACEMENT EVALUATION.

B. PREPLACEMENT EVALUATION (Initial Phase)

   3. Procedural Safeguards. Follow procedural safeguards to meet legal requirements
      and to establish communication with home.

   4. Multidisciplinary Team. Form multidisciplinary team, develop hypotheses,
      tailor the preplacement evaluation to the individual, assign
      responsibilities, and establish time lines.

C. PREPLACEMENT EVALUATION (Low Inference)

   5. Sensory Screening, Health, Developmental. Physical examination (if needed),
      health and developmental history, and sensory assessment by specialists.

   6. Language Dominance. Determine the child's primary language competence through
      formal measures and/or home interview.

   7. Educational Evaluation. Determine level, pattern, strengths and weaknesses
      in academic skills through formal and informal measures administered
      and interpreted by specialists.

D. PREPLACEMENT EVALUATION (High Inference)

   8. Perceptual-Motor/Psychological Process. Determine if severe process deficits
      are related to learning problem through administration of formal
      instruments, observation, and interview.

   9. Adaptive Behavior-Outside School. Investigate social competence outside of
      school through structured and unstructured interview.

  10. Social/Emotional. Determine nature and extent of social/emotional involvement
      or behavior disorders through interviews, observation, checklist,
      etc.

  11. Intelligence (Academic Aptitude). Determine general level of expectations for
      academic achievement through administration of individual
      intelligence test.

DECISION-MAKING

(See Guidelines at end of paper)

Consider this criterion: Would you be satisfied IF YOUR CHILD HAD BEEN INVOLVED
IN THIS ASSESSMENT PROCESS?



There is nothing new about the concept of a multifactored assessment.
Professional standards have always emphasized the importance of collection and
consideration of a broad variety of information as a part of any significant
classification/placement decision. Implementation of this notion has been less
consistent. Even more troublesome, documentation through reports and other records of
the full multifactored process has not been universal. For example, in presenting a
comprehensive record for a child classified and placed in special education, it is
important to thoroughly describe the initial referral and educational history, not
just the intelligence test data. In the past, the records for students in special
education programs often had little information beyond the intelligence test results.
Other types of information probably were collected and considered in most cases, but
were not documented.

The recent versions of the multifactored assessment reflect greater emphasis on
sources of information other than intelligence test data. This suggests the very
proper concern that intelligence test data not be the sole or primary source of
information for classification/placement decisions. Moreover, the information
collected as part of the low inference procedures described in Table 6 will, in some
cases, significantly influence the selection, administration, and interpretation of
high inference procedures such as intelligence tests. For example, some among us
(related services personnel) have had the embarrassing experience of administering a
verbal scale to a hearing impaired child, or a performance scale to a child who
needed (and had) glasses but wasn't wearing them that day. These kinds of errors are
humorous if corrected, but potentially tragic if allowed to stand. The point is that
the low inference procedures should always be conducted before the high inference
procedures.

Recent versions of multifactored assessment reflect more emphasis on the three
areas of adaptive behavior, primary language, and sociocultural background. These are
not totally new areas of assessment. However, the implicit and sometimes explicit
requirements that they be assessed systematically and considered carefully create
difficult challenges for related services professionals. In subsequent sections, the
conceptual and technological bases for these areas will be reviewed.

THE SYSTEM OF MULTICULTURAL PLURALISTIC ASSESSMENT

Discussions of bias in assessment are incomplete without consideration of the
System of Multicultural Pluralistic Assessment (SOMPA) (Mercer, 1979). The SOMPA
models and measures are particularly relevant to discussions of adaptive behavior
and sociocultural background. Much of the rationale for SOMPA is based on an
epidemiological study of mental retardation in Riverside, California.

Mercer's Riverside Studies

At about the time that national concern was increasing over the six hour
retarded child (see later section), a sociological analysis of the process whereby
persons were diagnosed as mentally retarded appeared in the literature (Mercer,
1970, 1973). Although the major findings of Mercer's Riverside, California study
were no surprise to professional personnel in mental retardation and special
education, the conclusions reached by Mercer called for substantial changes in
assessment practices. Of particular importance was the call for greatly increased
emphasis on adaptive behavior and sociocultural information.

The major findings of the Riverside Study were that public schools were by a
large margin the community agency most likely to diagnose persons as mentally
retarded. In comparison to other community agencies, the Riverside schools placed
more reliance on the results of individual intelligence tests and used a higher IQ
cut off score (79 rather than 75 or 70). Persons classified by public schools as
mentally retarded were often poor, of minority status, and situationally retarded.
Most were regarded as normal by their families and had not been diagnosed as
retarded prior to entering the public school. Mercer attributed these findings,
particularly the overrepresentation of minorities, to the use of a higher cut off
score by the schools, the failure of the schools to assess adaptive behavior, and the
biases in the IQ tests.

The findings reported by Mercer, which apparently have been influential in
litigation and legislation, came as no surprise to persons familiar with the
literature on mild mental retardation. For example, Heber commented in the 1961 AAMD
Manual, "Impairments in learning are usually most manifest in the school situation
and, if mild in degree, may not even become apparent until the child enters school"
(Heber, 1962, p. ^3). Further, Farber (1968) reviewed prevalence studies in mental
retardation and reported higher rates for mild mental retardation among the
economically disadvantaged, and a peak prevalence at the ages of about 10 to 14. Mild
mental retardation, in contrast to the more severe levels of mental retardation,
has been known for decades to be more prevalent among the poor and economically
disadvantaged minorities; to be more common during the school age years; to be
largely situational or school related; and to be impermanent.

Mercer's analysis of the Riverside studies also reflects some misconceptions
about the complex process in the public schools whereby children are classified as
mildly retarded. The actual role of standardized tests is clearly exaggerated.
There are few if any instances where cases of mild mental retardation are sought
through group standardized tests of ability or achievement. The use of group ability
tests appears to have declined in recent years, and in any event, the results of such
tests have never been a significant factor in the classification of students as
mentally retarded.

The most significant step in the process whereby students are classified as
mildly retarded is teacher referral due to poor performance in the classroom. Related
services personnel do not go out to schools and attempt to catch unwitting victims
with their psychometric nets. The only children to whom individual intelligence tests
are administered are those who have been referred. Mercer gave slight attention to
the importance of teacher referral. She reported that 72% of the students classified
as mentally retarded by the schools had repeated one or more years prior to
classification. The grade retention data suggest that the problems experienced with
the classroom situation were chronic rather than temporary, and that at least some
minimal alternatives were attempted within the regular classroom.

Mercer (1973) contended, however, that referral rates were not different among
white, black, and Hispanic students. If the referral rates were not different, the
clear implication is that what happened after referral was primarily responsible for
overrepresentation in the programs for the Educable (Mild) Mentally Retarded. What
happened after referral in most cases was of course an individual evaluation by a
school psychologist in which an intelligence test was usually administered. However,
the referral rate data reported by Mercer included all cases, not just those referred
for academic problems. Students referred for possible identification as gifted were
lumped together with those referred for academic problems. Data on the racial/ethnic
composition of the students referred for academic problems were not provided for the
Riverside Study.

Other data sources suggest that significantly more economically disadvantaged
and minority students are referred due to academic problems (Tomlinson, Acker,
Canter, & Lindborg, 1977). The effects of psychological evaluation, including
intelligence testing, on the population of economically disadvantaged minorities
referred for learning problems have not been studied adequately. Some data suggest
that individual psychological evaluation, including intellectual assessment, serves
to protect minority students from inappropriate classification as mentally retarded.
Ashurst and Meyers (1973) reported results from an analysis of all students (N = 269)
referred over a three-year period as suspected cases of mental retardation. These
data were also from the Riverside, California public schools. Referral rates were
considerably higher for minority students. The effects of psychological evaluation
were to reduce, not increase, the overrepresentation of minorities that would have
resulted from teacher referral. Contrary to the suggestions from Mercer's analysis
of the data from Riverside, intelligence test results provided some protection of
minority students from erroneous classification.

Although the precise role of intelligence tests in producing overrepresentation
of minorities in programs for the mildly retarded continues to be a source of
debate, other issues from the debate over the six hour retarded child are equally
important. The question of whether economically disadvantaged minorities are
overrepresented due to socioeconomic status (SES) or minority status has not been
studied sufficiently. Mercer (1973) concluded that SES accounted for some but not all
of the overrepresentation of minorities. However, the actual data on mean SES levels
of the total EMR population and for the different racial/ethnic groups in the EMR
population were not reported. Other concerns to be discussed later involve the
conceptions of mild mental retardation and adaptive behavior for school age children.

The conclusions reached in the Riverside Study and from the broader concern
for the six hour retarded child have had a profound influence on related services
disciplines and special education. Mercer (1973) recommended three major changes
in the diagnostic procedures used in the public schools. First, it was suggested
that the IQ cut off be lowered to the traditional criterion of about two standard
deviations below the mean rather than the higher cut off score used then, and now,
in many state education codes. The major justification for this change was, "At
this criterion level, persons are least likely to be labeled as retarded who, as
adults, will be able to fill a normal complement of social roles" (Mercer, 1973,
p. 221). Implicit in this recommendation is the view that "true" mental retardation
is a permanent condition. The second recommendation was that adaptive behavior
should be emphasized more in classification decisions. Accompanying this
recommendation was a broadening of the conception of adaptive behavior in comparison
to the 1961 AAMD Manual (see later section). Finally, pluralistic norms were
advocated for the purpose of correcting the bias in IQ tests. The SOMPA represents
Mercer's attempts to implement the last two recommendations.
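
The practical effect of the cut off scores discussed above can be approximated with
a normal curve. The Python sketch below is an illustration only, not an analysis of
the Riverside data. It assumes IQ scores are normally distributed with a mean of 100
and a standard deviation of 15, and the function name is hypothetical. Actual score
distributions, particularly at the low end, may depart from this model.

```python
import math

# Assumed scaling for the illustration: mean 100, standard deviation 15.
MEAN_IQ = 100.0
SD_IQ = 15.0


def proportion_below(cutoff: float, mean: float = MEAN_IQ, sd: float = SD_IQ) -> float:
    """Share of a normal distribution falling below the given cut off score."""
    z = (cutoff - mean) / sd
    # Normal cumulative distribution function expressed with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # The three cut off scores mentioned in the discussion of the Riverside Study.
    for cutoff in (79, 75, 70):
        share = proportion_below(cutoff)
        print(f"cut off {cutoff}: {share:.1%} of the assumed distribution falls below")
```

Under these assumptions, a cut off of 70 corresponds to two standard deviations
below the mean, and each higher cut off makes a larger share of the population
eligible on the IQ criterion alone.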

SOMPA Models and Measures

SOMPA is a highly complex and innovative approach that has been the subject
of much, sometimes acrimonious, debate (see Nos. 1 & 2 of Vol. 8, School Psychology
Digest). I encourage all school psychologists to study this approach carefully,
and look forward to research on applications of SOMPA. The unfortunate trend
currently is toward extreme reactions, positive and negative, ranging from those who
"feel" that it is the best thing that has ever been developed to those who "feel"
that SOMPA represents a diabolical plot against school psychologists, special
educators, children, and so on. The debate has often been useful, but suspension of
judgment until more empirical information is available is clearly indicated. The
authors', publisher's, and critics' claims and views notwithstanding, we need much
more information before reaching firm conclusions.

At the present, SOMPA provides three major innovations concerning assessment
practices. The specification of three models of assessment in terms of assumptions,
values, and appropriate instruments is one of the major components as well as a
controversial aspect of the system. A second innovation is the development of new
instruments such as the Physical Dexterity Battery, Sociocultural Scales, Health
History Inventory, and Adaptive Behavior Inventory for Children (ABIC). Many of
these instruments will be useful data collection devices regardless of the outcome
of the debate on other features of SOMPA. Finally, SOMPA combines the models with
conventional and new data collection devices to develop a more refined
classification system. It is important to note that the primary information from
SOMPA at the present is of a classification, not programming, nature. Techniques to
use SOMPA information in completion of the full diagnostic construct criteria
(Cromwell, Blashfield, & Strauss, 1975) are at present not available. The ultimate
usefulness of SOMPA will be determined by the degree to which the information
provided is related to educational placement and programming decisions, a point
which the authors of SOMPA have also stressed.

Several specific issues need to be addressed in the near future regarding uses
of SOMPA. There is the question of the generalizability of SOMPA normative data to
other groups, e.g., Native Americans, and to the same groups in different geographic
regions, e.g., Hispanics in the Northeast. The SOMPA standardization data are based
on carefully selected samples of children, but sample selection was restricted to
California. The population in California, although diverse, is not necessarily
typical of samples elsewhere, e.g., Anglos in Iowa, Blacks in rural Alabama, or
Hispanics in New York City. The authors of SOMPA suggest collection of data from
random samples of children in different localities to determine if the SOMPA
California norms and regression formulas are appropriate for specific groups of
children. Such studies, although expensive, are clearly necessary prior to widespread
use of the system. A second issue is related to the generalizability to other groups
of the data on the relationship of the WISC-R to other measures in SOMPA. Finally,
there is the issue of the effects on children, particularly in terms of educational
classification and programming, of use of SOMPA. Limited data on these questions are
now available and will be discussed in later sections.

ADAPTIVE BEHAVIOR

Concern for what is now called adaptive behavior is not new. The term social
competence was used prior to about 1960 to refer to approximately the same
construct. Social competence or adaptive behavior has also been a fundamental concept
throughout the history of efforts to describe and explain the phenomenon of mental
retardation.

Although the construct of adaptive behavior is not new, a number of recent
events have led to considerably more emphasis on use of adaptive behavior data in
special education classification and placement decisions. Revisions of the AAMD
Manual on Terminology and Classification in 1961 and 1973 reflected increasingly
greater emphasis on adaptive behavior. The "normalization" effort, which has the
primary purpose of integrating institutionalized mentally retarded persons into
community settings, was a second major influence on adaptive behavior. From this
perspective adaptive behaviors are viewed as the "reversible" features of the more
severe levels of mental retardation (Leland, 1978). Another somewhat unrelated
trend was the emphasis on nonbiased assessment that resulted from litigation and
legislation in the 1970's. Adaptive behavior from this perspective was seen as a
means to reduce the emphasis on intelligence test results, to provide more equitable
assessment for minorities, and to alleviate the overrepresentation of minorities in
special education programs for the mildly retarded (Coulter & Morrow, 1978).

In view of the diverse influences and different purposes underlying the recent
upsurge of interest in adaptive behavior, it is not surprising that much confusion
exists over the measurement and use of adaptive behavior data. In addition to
these sources of confusion, the recent federal legislation implies that adaptive
behavior data must be considered in all special education placement decisions.
Perhaps the best recent source of information on adaptive behavior is the book edited
by Coulter and Morrow (1978) cited earlier. Their discussion of unresolved issues
surrounding the adaptive behavior concept, available measures, and possible uses is
recommended highly.

Adaptive Behavior and Definitions of Mental Retardation

For approximately two decades the AAMD definition of mental retardation has
included the dimensions of intelligence and adaptive behavior. However, the emphasis
on adaptive behavior was increased in the 1973 version. The 1961 version described
mental retardation as subaverage general intellectual functioning which is
associated with impairment in adaptive behavior. The 1973 and 1977 versions placed more
emphasis on adaptive behavior by changing "associated" to "existing concurrently."
This change toward placing relatively equal emphasis on both of the dimensions of
mental retardation and the subtle changes in the conception of adaptive behavior
from the 1961 to the 1973 versions constitute difficult challenges for diagnostic
personnel.

By now it is likely that most educational definitions of mental retardation
include both the intelligence and adaptive behavior dimensions. According to a
recent survey (Patrick & Reschly, 1980), about two-thirds of the states required
assessment of adaptive behavior for one or more of the special education
classifications, usually mental retardation. A number of additional states reported efforts
to add adaptive behavior to the state definition of mental retardation. However,
the majority of states did not have a definition of adaptive behavior, and much
confusion was reported concerning definition, domains of adaptive behavior, and
availability of measures. Although the status of adaptive behavior in special education
undoubtedly varies from state to state, the trend is toward more emphasis on this
dimension, at least with the mentally retarded.

Conceptions of Adaptive Behavior 

One of the most influential definitions and descriptions of adaptive behavior
is provided in the AAMD Manual on Terminology and Classification. The AAMD
conception and criteria for adaptive behavior during the school age years changed in subtle
ways from 1961 to 1973. Consider the following description from the 1961 revision.

"Adaptive behavior refers primarily to the effectiveness of the
individual in adapting to the natural and social demands of his
environment. Impaired adaptive behavior may be reflected in:
1) maturation, 2) learning, and/or 3) social adjustment. These
three aspects of adaptation are of different importance as
qualifying conditions of mental retardation for different age groups."

"Learning ability refers to the facility with which knowledge is
acquired as a function of experience. Learning difficulties are
usually most manifest in the academic situation and if mild in
degree may not even become apparent until the child enters
school. Impaired learning ability is, therefore, particularly
important as a qualifying condition of mental retardation during
the school years."

Quotes from Heber, 1961, p. 3-4.

Using the description of adaptive behavior from the 1961 version, one might
focus attention entirely on performance in the public school context for school age
children. Adaptive behavior for school age children in this version appears to be
based at least primarily on academic competence. For school age children this
conception might be interpreted as specifying a diagnosis of mental retardation
based only on intelligence, classroom academic performance, and results of
standardized achievement tests. Other characteristics and behaviors specified in
current conceptions of a multifactored assessment should have been and often were
considered in mild mental retardation classification/placement decisions. However,
the clear implication in the 1961 revision was that academic performance was the
most important index of adaptive behavior for school age children. With
considerable justification, one could argue that up to 1973, when the AAMD Manual was
revised, diagnostic personnel in the schools were assessing adaptive behavior as
conceptualized at that time.

The changes in conception of adaptive behavior for school age children in the
1973 and 1977 revisions of the AAMD Manual are illustrated in the quotes below.
As noted previously, the 1973 and 1977 revisions are virtually identical.

"Adaptive behavior is defined as the effectiveness or degree with
which an individual meets the standards of personal independence
and social responsibility expected for age and cultural group."
Grossman, 1977, p. 11.

"During childhood and early adolescence in:

5. Application of basic academic skills in daily life activities

6. Application of appropriate reasoning and judgment in mastery
of the environment

7. Social skills (participation in group activities and
interpersonal relationships)"

"The skills required for adaptation during childhood and early
adolescence involve complex learning processes. This involves
the process by which knowledge is acquired and retained as a
function of the experiences of the individual. Difficulties
in learning are usually manifested in the academic situation
but in evaluation of adaptive behavior, attention should focus
not only on the basic academic skills and their use, but also
on skills essential to cope with the environment, including
concepts of time and money, self-directed behaviors, social
responsiveness, and interactive skills."

Quotes from Grossman, 1977, p. 13-14.

The recent revisions of the AAMD Manual placed more emphasis on adaptive
behavior and broadened the concept of adaptive behavior during the school age years.
It should be noted that, contrary to some recent trends in conceptions and measures
of adaptive behavior, the AAMD conception does continue to include performance in
academic settings as an important component of adaptive behavior during the school
age years. For children in this age group, school performance is a necessary part
of the construct of adaptive behavior (see below). However, performance in other
social settings should also be considered.

Other conceptions of adaptive behavior have been proposed in recent years (see
Coulter & Morrow, 1978; Reschly, 1980, 1981 for reviews). The common features of
conceptions of adaptive behavior are emphases on developmental (age appropriate)
criteria and consideration of cultural context. Conceptions of adaptive behavior
for school age children differ sharply on the issues of: 1) inclusion or exclusion
of the cognitive competencies that underlie adaptive behaviors; 2) the social
settings and social roles (school vs. out of school) included; and 3) the data source,
i.e., third party respondent or direct observation of the individual. In addition,
conceptions and measures of adaptive behavior have been developed for different
purposes, classification/placement vs. program planning/intervention, and for different
populations, mildly retarded vs. more severely retarded.

Assessment of Adaptive Behavior 

The purpose of assessment, i.e., the decision that needs to be made about or
with a student, is the most basic consideration in the selection of a formal
measurement instrument or informal data collection procedure (Salvia & Ysseldyke, 1978).
Clarifying the purpose through explicit statements of the decisions to be made is
particularly important in the assessment of adaptive behavior.

If the purpose of assessment is program planning/intervention with the
moderately, severely, or profoundly retarded, the currently available adaptive behavior
instruments are reasonably adequate for most ages. Some instruments have been
developed carefully with rigorous measurement and statistical criteria applied to
selection of items. A sample list of some of the more prominent instruments is
provided in Table 7, which is reprinted from Oakland and Goldwater (1979).

Although a number of adaptive behavior measures are listed in Table 7, it
should be noted that only two of them are designed specifically for school age
populations of normal, borderline, and mildly retarded persons (the AAMD-School
and the ABIC). The primary focus in this paper is on nonbiased assessment, which
is principally a concern about appropriate classification/placement decisions with
mildly handicapped persons. Adaptive behavior is one of the key areas in the
multifactored assessment scheme developed by Tucker and mentioned in the PL 94-142



Table 7

Measures of Adaptive Behavior

Measure                                            Age range     Reliability and validity
                                                                 data available

AAMD Clinical Version (Nihira et al., 1974)        3-adult       Yes
AAMD Public School Version (Lambert et al., 1974)  7-13          Yes
Cain-Levine (Cain et al., 1963)                    5-13          Yes
California Preschool (Levine et al., 1969)         2-5           Yes
Camelot Behavioral Checklist (Foster, 1974)        2-adult       Yes
Adaptive Behavior Inventory for Children
  (Mercer & Lewis, 1978)                           5-11          Yes
Preschool Attainment Record (Doll, 1966)           birth-7       No
Vineland scale (Doll, 1966)                        birth-25      Yes

[The entries in the remaining columns of Table 7 (behaviors assessed, population
type, purpose, examiner, respondent, scores) are not legible in the source.]

*With extensive training in interviewing.
Reprinted from Oakland & Goldwater, 1979, p. 147.



Rules and Regulations. However, the present level of technology with respect to
assessment of adaptive behavior with the mildly handicapped, including the mildly
retarded, is characterized well in the following quotes.

"The inclusion of adaptive behavior in nonbiased assessment by the
use of tests or scales to facilitate comparison of a child with
his/her peers is not yet perfected." (CORRC, p. 20, undated report
distributed in 1979).

"Presently, the assessment of adaptive behavior through clinical
interviews and observations of the child's behavior in other
social systems represents the major alternatives for pupil
appraisal professionals, if the goal of assessment is primarily
placement. Until psychometric technology provides a variety of
suitable and more objective behavior measures, the more informal,
and thereby subjective, methods will remain in wide use." (CORRC,
p. 21, see above).

Problems with assessment of adaptive behavior also were mentioned prominently
in the AAMD Manual. Grossman (1977, pp. 20-21) emphasized the following problems:
1) the frequent discrepancies in level of adaptive behavior and level of
intelligence with the mildly retarded; 2) the unavailability of adaptive behavior
instruments that are sufficiently precise to establish a definite cut-off score such as
minus two standard deviations from the population mean; and 3) the major limitations
of most available instruments, such as poor norms and item content selected from
studies of institutional populations. In view of these limitations, Grossman
suggested that assessment of adaptive behavior must involve a large degree of clinical
judgment.
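
The cut-off criterion Grossman mentions can be made concrete with a short sketch. The scale mean and standard deviation used here are illustrative assumptions, not the norms of any instrument discussed in this paper, and the function names are hypothetical.

```python
def cutoff_score(mean, sd, n_sd=2.0):
    """Score lying n_sd standard deviations below the population mean."""
    return mean - n_sd * sd


def below_cutoff(score, mean, sd, n_sd=2.0):
    """True when a score falls at or below the cut-off."""
    return score <= cutoff_score(mean, sd, n_sd)


# Hypothetical scale with a mean of 100 and a standard deviation of 15.
print(cutoff_score(100, 15))       # 70.0
print(below_cutoff(68, 100, 15))   # True
print(below_cutoff(75, 100, 15))   # False
```

A rule of this kind is only as trustworthy as the norms and the precision of the scores that feed it, which is why Grossman places the weight on clinical judgment rather than on a fixed score.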

Clearly, the available technology leaves much to be desired with respect to
assessment of adaptive behavior with normal and mildly retarded children. A
considerable amount of additional work on instrument development and research is
needed. However, the picture suggested in the quotations above may be a bit too
negative. There has been some instrument development and research in recent years
that should be applied to the assessment of adaptive behavior in classification/
placement decisions. Judicious use of the results from these instruments, along
with informal sources of data on adaptive behavior, should become a part of the
comprehensive evaluation that is conducted prior to classification/placement decisions.

Review of Adaptive Behavior Measures for the Mildly Retarded 

AAMD Adaptive Behavior Scale - Public School (ABS-PS). The most important
influences leading to the development of the ABS-PS were legal requirements in
California regarding the classification/placement of students in EMR programs. Other
purposes, such as providing information for educational programs and remediation,
were also cited by the authors (Lambert, Windmiller, Cole, & Figueroa, 1975).

The items on the ABS-PS are a subset of items from the AAMD Adaptive Behavior
Scale - Clinical (ABS-C). The ABS-C was developed from extensive studies of deficit
behaviors among institutionalized mentally retarded persons. The purpose of the
ABS-C was to pinpoint behaviors which prevented placement of severely retarded
persons in community settings. Once these behaviors are identified, the focus is then
on remediation and, eventually, placement in less restrictive settings. The
critical point is that the items on the ABS-C were selected from studies of severely
retarded persons for the purpose of improving program planning/intervention. The
content of the Public School version is the same as the Clinical version except for
the deletion of 15 of the original 110 items which were judged to be inappropriate
for public school students.

The ABS - Public School is divided into two major sections. The first part
might be termed adaptive behaviors since high scores on this section indicate higher
social functioning. The second part might be called maladaptive behaviors since the
higher the score, the lower the level of social functioning. The nine domains
involving 56 items on the first part are Independent Functioning, Physical Development,
Economic Activity, Language Development, Numbers and Time, Vocational Activity,
Self-Direction, Responsibility, and Socialization. A sample item from the Shopping Skills
area of the Economic Activity Domain is:

30. Errands (Circle only one)

Goes to several shops and specifies different items         4
Goes to one shop and specifies one item                     3
Goes on errands for simple purchasing without a note        2
Goes on errands for simple purchasing with a note           1
Cannot be sent on errands                                   0

The second part, comprised of 39 maladaptive behavior items, has the twelve
domains of Violent and Destructive Behavior, Antisocial Behavior, Rebellious
Behavior, Untrustworthy Behavior, Withdrawal, Stereotyped Behavior and Odd Mannerisms,
Inappropriate Interpersonal Manners, Unacceptable Vocal Habits, Unacceptable or
Eccentric Habits, Hyperactive Tendencies, Psychological Disturbances, and Use of
Medications. Item 32 of the Hyperactive Tendencies Domain is as follows:

32. Has Hyperactive Tendencies

                                                    Occasionally   Frequently
Talks excessively                                        1              2
Will not sit still for any length of time                1              2
Constantly runs or jumps around the room or hall         1              2
Moves or fidgets constantly                              1              2
Other (Specify ________)                                 1              2
None of the above
                                                                   Total ____

The child's classroom teacher is the recommended respondent for the ABS-PS.
Respondents are allowed to infer or, if necessary, to guess regarding the child's
competencies, particularly those which take place outside of school.

The norms for the ABS-PS are based on a sample of 2600 school age children in
California. Norms cover the ages of 7-13. Separate norms are provided by class
placement (regular vs. types of special classes) for Sections I and II of the ABS.
In addition, separate norms by ethnicity and sex are provided for Section II.

The interpretation of the ABS-PS is based on comparison of the individual's
profile of percentile ranks to modal profiles of children placed in different
educational programs. No standard scores are provided for the domain scores, and no
overall score for the major sections is available.

Although the ABS-PS has many limitations, it can be a useful adjunct to
clinical judgment in classification/placement decisions and, to a lesser degree, in
program planning/intervention decisions. The ABS-PS appears to be more appropriate
for lower functioning children in the EMR range. The major weaknesses of the
instrument are the following. First, the content validity of the items is
questionable in view of the original purpose of the ABS-Clinical version. Second, the item
format requires a considerable degree of inference or even guessing. Third, the
respondent is the teacher, who usually has little information about social role
performance outside of school. Finally, the method of interpretation, comparing
profiles, is highly subjective in many cases.

The Adaptive Behavior Inventory for Children (ABIC) was developed with the
explicit purpose of improving classification/placement decisions with the mildly
retarded (Mercer, 1979). The ABIC reflects a strong social system perspective with
emphasis on how the child functions in different settings and different social roles.

The ABIC items were selected on the basis of intensive interviews with mothers
of children between the ages of 5 and 11. The item pool of 480 questions was
reduced to 252 questions on the basis of a questionnaire study. These 252 items were
administered to a standardization sample. Ten items were deleted, resulting in 242
items in the final published version. Most items are age graded. A basal and
ceiling procedure is used in administration of the ABIC. The domains covered by the
ABIC are Family, Community, Peer Group, Nonacademic School, Earner/Consumer, and
Self-Maintenance.



Sample items from each of the ABIC domains are provided below.

Domain               Item

Family               147. When ____ cannot have what he/she wants immediately,
                          how often does he/she get angry and fuss about it?
                          0 most of the time
                          1 sometimes, or
                          2 almost never

Community            142. When visiting relatives or friends outside the
                          neighborhood, does ____ usually
                          0 go with an older person
                          1 go with children his/her own age, or
                          2 go alone?

Peer Relations       144. How often does ____ meet and play with his/her friends
                          at a special place like a vacant lot, a park, the
                          street, the school bus stop, or a courtyard?
                          1 sometimes
                          0 seldom or never, or
                          2 often

Nonacademic School   132. How often does ____ take his/her school supplies and
                          books to school without being reminded?
                          1 occasionally
                          0 seldom, or
                          2 regularly

Earner/Consumer      140. Does ____ make correct change for a dollar
                          2 without help
                          1 only with help, or
                          0 not at all?

Self-Maintenance     143. Does ____ order food at a restaurant
                          2 without help
                          1 with some help, or
                          0 does someone order for him/her?

The ABIC is administered as a structured interview. The primary caretaker of
the child, typically the mother, is the preferred respondent. For each item the
mother chooses among three possible responses.

Standard scores with a mean of 50 and standard deviation of 15 are provided
for each domain. The average of these standard scores is used as a composite or
global index of adaptive behavior. In addition, three other scores are provided.
The Veracity Scale attempts to detect a "fake good" response set. The "No
Opportunity" and "Not Allowed" responses are seen as an indication of the amount of
restriction placed on the child. Finally, the "Don't Know" responses are viewed as an
indication of the amount of knowledge the respondent has about the child's activities.
If critical values are exceeded on the three ancillary scales, interpretation of the
other scores is not recommended.

California schools between the ages of 5 and 11. Stratification variables included
sociocultural group (Anglo, Black, and Hispanic), size of community, and gender.
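
The ABIC composite and the ancillary-scale check described above might be sketched as follows. This is only an illustration under stated assumptions: the domain labels follow the sample items, the critical values and all scores are invented, and the rule that exceeding any single critical value blocks interpretation is an assumed reading of the text rather than a procedure taken from the ABIC manual.

```python
# Domain labels follow the sample items given in the text.
ABIC_DOMAINS = (
    "Family",
    "Community",
    "Peer Relations",
    "Nonacademic School",
    "Earner/Consumer",
    "Self-Maintenance",
)


def composite_score(domain_scores):
    """Average the six domain standard scores (mean 50, SD 15 metric)."""
    missing = [d for d in ABIC_DOMAINS if d not in domain_scores]
    if missing:
        raise ValueError(f"missing domain scores: {missing}")
    return sum(domain_scores[d] for d in ABIC_DOMAINS) / len(ABIC_DOMAINS)


def interpretable(ancillary, critical):
    """Assumption: exceeding any one critical value blocks interpretation."""
    return all(ancillary[name] <= critical[name] for name in critical)


# Hypothetical child; every number below is invented for illustration.
scores = {
    "Family": 52,
    "Community": 47,
    "Peer Relations": 55,
    "Nonacademic School": 49,
    "Earner/Consumer": 50,
    "Self-Maintenance": 47,
}
ancillary = {"Veracity": 3, "No Opportunity/Not Allowed": 5, "Don't Know": 2}
critical = {"Veracity": 10, "No Opportunity/Not Allowed": 12, "Don't Know": 8}

if interpretable(ancillary, critical):
    print(composite_score(scores))  # 50.0
```

Checking the ancillary scales before computing the composite mirrors the order implied in the text: the domain scores are read only after the response set appears trustworthy.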

The ABIC is the only instrument in which the entire design, from item selection
to standardization, was directed toward classification/placement decisions with
normal, borderline, and mildly retarded children. Face validity of the items in
the domains included on the scale appears to be good. The types of derived scores
are appropriate for classification/placement decisions. The ancillary measures
provide safeguards against interpretation of invalid information. The primary type
of information provided is related to social role performance outside of school from
the perspective of the parent (or primary caretaker).

Although the ABIC is the best instrument published to date for assessment of
adaptive behavior outside of school with normal or mildly handicapped children, a
number of weaknesses should be recognized when interpreting scores. The age range
is limited to 5-11 years. The norms are based entirely on California school age
children. The accuracy of these norms in other settings and for other groups is
questionable (Kazimour & Reschly, in press). An important domain of adaptive
behavior for school age children, academic role performance, is not included on the
scale and is de-emphasized in Mercer's conception of adaptive behavior. Finally,
practical considerations of time and resources may limit the implementation of this
method of assessing adaptive behavior.

The Vineland Social Maturity Scale (VSMS) (Doll, 1953) is one of the oldest
measures of social competence (adaptive behavior) and continues to be used quite
widely (Coulter & Morrow, 1978). One of the reasons for the current use of the
VSMS is that other scales are limited in age range or were not available until very
recently.

The VSMS is a loosely structured interview which requires considerable skill
on the part of the examiner. The VSMS attempts to measure social competence over
the ages of birth to 30 years. As might be expected, the items vary considerably in
terms of sophistication and ease of administration. The domains of behavior covered
by the VSMS are: Self-Help General, Self-Help Eating, Self-Help Dressing,
Locomotion, Occupation, Communication, Self-Direction, and Socialization. The VSMS yields
a composite score which can be transformed to a Social Quotient (SQ), which is the
ratio of Social Age to chronological age multiplied by 100. The
standard deviation of the SQ varies considerably from age to age, which is generally
the case with developmental scores such as ratio IQ, grade equivalents, etc. The
norms for the VSMS are based on rather restricted samples of individuals assessed
in 1935.
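
The Social Quotient arithmetic can be shown in a few lines. The ages below are invented for illustration, and the function name is hypothetical; only the ratio itself comes from the description above.

```python
def social_quotient(social_age, chronological_age):
    """Ratio SQ: Social Age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * social_age / chronological_age


# The same two-year lag in Social Age at two different ages (invented values).
print(social_quotient(6.0, 8.0))    # 75.0
print(social_quotient(14.0, 16.0))  # 87.5
```

Because the same absolute lag produces different quotients at different ages, a given SQ does not carry the same meaning across the age range, which is one aspect of the difficulty with ratio scores noted above.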

The VSMS is a venerable instrument which is in rather desperate need of
revision and renorming, an activity that is currently underway and which may substantially
improve the scale. For older students it does provide some information that can be
used to supplement clinical judgment of adaptive behavior. Direct use of SQ scores
in classification/placement decisions is probably inappropriate for a variety of
reasons (poor norms, limited sample of behavior, etc.).

The Children's Adaptive Behavior Scale (CABS) is a recently developed adaptive
behavior scale which reflects some innovative approaches. The CABS (Richmond &
Kicklighter, 1980) is administered directly to the child rather than to a third party
respondent. The items on the CABS are organized around the rather typical domains
of Language Development, Independent Functioning, Family Role Performance,
Economic-Vocational Activity, and Socialization. In contrast to other adaptive behavior
scales, the CABS appears to emphasize the cognitive competencies which are required
for various adaptive behaviors. For example, on the Independent Functioning
Domain one of the items is "Where could you find a doctor?" In the Socialization
Domain one of the items is "What should you say if someone gives you a piece of
candy?" The norms for the CABS are based on rather restricted samples of slow
learning and EMR students.

Relatively little is known about the CABS. It is likely that considerable
research will be conducted with this procedure in the future. For the time being
the CABS should be used cautiously, if at all, pending research on its psychometric
characteristics.

Research On Adaptive Behavior 

Relatively little research has been published on the recently developed
adaptive behavior scales. Three questions concerning adaptive behavior measures
are particularly relevant to related services personnel. The limited evidence on
these questions is reviewed in this section.

Relationship of Adaptive Behavior and Intelligence. A comprehensive review of
the literature on social competence (the forerunner of adaptive behavior) and
intelligence revealed a great deal of variability among studies (Leland, Shellhaas,
Nihira, & Foster, 1967). The relationship between social competence and IQ varied
depending on the measures used, the type of subject, and the variability within
samples. However, in most studies, correlations between social competence and IQ
were in the moderate range, about .4 to .6. These correlations, although
substantial, indicate that social competence and intelligence were quite different for a
sizeable number of persons.
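
One conventional way to read correlations of this size is to square them, which gives the proportion of variance the two measures share. The sketch below applies that standard computation to the .4 to .6 range; it is offered as an illustration and is not an analysis reported in the studies reviewed.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


for r in (0.4, 0.6):
    shared = shared_variance(r)
    print(f"r = {r:.1f}: {shared:.0%} shared, {1 - shared:.0%} not shared")
```

Even at the upper end of the range, roughly two-thirds of the variance in one measure is not accounted for by the other, which is consistent with the observation that the two constructs diverge for many individuals.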

Relatively few studies of the correlations between IQ and recently developed
measures of adaptive behavior have appeared in the literature. No studies were
located for the ABS-Clinical or the ABS-PS. The significant differences on the ABS-PS
between students in regular and EMR programs suggest that the ABS-PS is probably
correlated at a statistically significant level with IQ, since IQ scores were of course
one of the bases for placing students in the EMR programs. However, these data do
not provide information on the size of these correlations.

Correlations between the ABIC and WISC-R scores have been reported by a number
of authors (Kazimour & Reschly, in press; Mercer, 1979; Oakland, 1980). These
correlations have been in the low range, varying from near zero to as high as .3 with a
median of about .15. These correlations are considerably lower than those reported
previously for social competence measures and lower than the correlations reported
by Mercer (1973, p. 187) for IQ and a forerunner of the ABIC used in the Riverside
Studies. A number of reasons might account for these lower correlations. The most
obvious factor is that the ABIC de-emphasizes the cognitive underpinnings of
adaptive behavior, and academic or school achievement types of behaviors are excluded.
The evidence available to date suggests that the ABIC and measures of intelligence
are largely independent.

In contrast to the results for the ABIC, fairly high correlations between
adaptive behavior and intelligence were reported by the authors of the CABS
(Kicklighter, Bailey, & Richmond, in press). For a sample of mildly retarded and slow
learning children the correlations were in the range of .4 to .5. These
correlations would probably be even higher if children from the full range of intelligence
were studied. The higher correlations for the CABS in contrast to the
ABIC are probably due to the greater emphasis on the cognitive aspects of adaptive
behavior. It should be noted that the correlations for the CABS are more consistent
with the results of research conducted with the older measures of social competence.

The choice of adaptive behavior measures appears to be the major influence on
the relationship of adaptive behavior and intelligence. Traditional measures such
as the VSMS and the more recently developed CABS are correlated with intelligence
at a moderate level. The correlation of the ABIC with intelligence is low enough
that the relationship has no practical significance. The relationship of adaptive
behavior to intelligence has significance for specification of the meaning of both
constructs. Are the constructs independent? Are both a subset of a more general
construct of general developmental level (Lambert, 1979)? Is one a subset of the
other? If so, which is the more general construct? Theoretical formulations,
research, and instrument development in the 1980's will undoubtedly address these
questions.
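
One way to see why a correlation of about .15 carries little practical weight is
to square it, which gives the proportion of variance the two measures share. The
sketch below does this for the correlations quoted above. The function name and
the labels are illustrative only and are not drawn from any of the cited studies.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations quoted in the text: the ABIC median (.15), the ABIC upper
# bound (.3), and the range reported for the CABS (.4 to .5).
for label, r in [("ABIC median", 0.15), ("ABIC high", 0.30),
                 ("CABS low", 0.40), ("CABS high", 0.50)]:
    print(f"{label:12s} r = {r:.2f}  shared variance = {shared_variance(r):.1%}")
```

Even at the upper end of the CABS range, most of the variance in adaptive
behavior scores is not shared with intelligence.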

Effects of Adaptive Behavior Measurement on Classification/Placement. Much of
the impetus for the inclusion of adaptive behavior as part of a comprehensive
assessment stemmed from concerns about overrepresentation of minorities in special classes
for the mildly retarded. Depending on the adaptive measure used, research indicates
that adaptive behavior assessment does indeed reduce overrepresentation of
minorities in special classes for the mildly retarded (Reschly, in press; Talley, 1979).
Fisher (1978) reported a high rate of "declassification" among all groups, not just
minorities, as a result of the direct use of the ABIC in classification/placement
decisions. The percentages of students declassified were 60, 70, and 85 for Anglo,
Mexican-American, and Black students respectively. Apparently use of the ABIC
affects classification decisions with all groups, not just minorities. The question
that remains is whether declassification is of benefit to the children involved.

The effects of other adaptive behavior scales on total percentages and group
percentages of children classified as mildly retarded have not been reported in the
literature. However, it is obvious that adaptive behavior and intelligence are far
from being perfectly correlated. If very low scores on both dimensions are required
for classification, then the prevalence of mild mental retardation among school age
children will undoubtedly be well below the popular estimates of 2 to 2.5 percent.
If the IQ cutoff is at -2 standard deviations and adaptive behavior is based on
out of school social role performance, the prevalence of all types of mental
retardation among school age children will likely be closer to 1%, perhaps even lower.
Assessment of adaptive behavior outside of school will have little if any effect on
the prevalence of the more severe levels of mental retardation. The prevalence of
moderate, severe, and profound levels of mental retardation, usually estimated at
.3 to .5%, i.e., 3 to 5 per thousand, would be unaffected since persons obtaining
IQs at these levels are nearly always found to be deficient in adaptive behavior as
well (Grossman, 1977). The conception of, and measurement procedures used to assess,
adaptive behavior have broad implications for the diagnostic construct of mild
mental retardation.
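
The effect of requiring low scores on both dimensions can be illustrated with a
rough calculation. The sketch below assumes, purely for illustration, that IQ and
adaptive behavior scores follow a bivariate normal distribution and that both
cutoffs sit at -2 standard deviations. Neither assumption comes from the studies
cited here, and the function names are hypothetical.

```python
import math


def phi(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def joint_prevalence(iq_cut: float, ab_cut: float, rho: float,
                     steps: int = 20000) -> float:
    """P(IQ z < iq_cut and adaptive behavior z < ab_cut), bivariate normal.

    Integrates phi(x) * Phi((ab_cut - rho * x) / sqrt(1 - rho**2)) over x
    from -8 to iq_cut with the midpoint rule. Requires |rho| < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be strictly between -1 and 1")
    lower = -8.0                      # far enough out that the tail is negligible
    width = (iq_cut - lower) / steps
    scale = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += phi(x) * Phi((ab_cut - rho * x) / scale)
    return total * width


single = Phi(-2.0)                    # prevalence with the IQ cutoff alone
print(f"IQ cutoff alone: {single:.2%}")
for rho in (0.15, 0.30, 0.50):        # correlations in the range quoted above
    both = joint_prevalence(-2.0, -2.0, rho)
    print(f"rho = {rho:.2f}: both cutoffs {both:.3%}")
```

Under these assumptions the joint figure is always smaller than the figure for the
IQ cutoff alone, and it shrinks as the correlation between the two measures falls.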

Generalizability of Norms. Classification/placement decisions are typically
made on the basis of degree of need, or the degree of deviation from typical patterns
of behavior. Such decisions in the area of mildly handicapping conditions require
the use of norm referenced measures. The representativeness and accuracy of norms
for adaptive behavior measures are therefore an important consideration.

The situation with respect to the quality of the norms for existing adaptive
behavior scales is not good. Both the ABS-PS and the ABIC use norms based
exclusively on California children. The norms for the CABS and the VSMS are similarly
restricted to persons from a specific geographic area along with other limitations.

The accuracy of the ABIC norms for children in other localities has been
investigated on a limited basis. Buckley and Oakland (1977) studied the accuracy
of California norms for two samples of Mexican-American children in Texas. The
California mean scores were higher for both samples with a difference as large as
1/3 standard deviation for one of the samples. A difference of this magnitude
might very well have implications for classification/placement decisions. Based
on study of three groups (Anglo, Black, and Mexican-Americans) in Texas, Gridley
and Mastenbrook (1977) again concluded that California norms were inappropriate
for Mexican-Americans, but acceptable for Anglos and Blacks. Kazimour and Reschly
(in press) also found that the California means on the ABIC were higher in a
study of four groups (Anglo, Black, Chicano, and Native American Papago) in Pima
County, Arizona. The size of the differences was rather small for all the groups
except Native American Papagos, whose ABIC composite mean was nearly 2/3 of a
standard deviation below the California population average.
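
To make the practical meaning of these mean differences concrete, the sketch below
estimates what share of a local group would fall below a cutoff fixed on the norm
group when the local mean is lower by 1/3 or 2/3 of a standard deviation. It
assumes normally distributed scores, equal spread in both groups, and a
hypothetical cutoff at -2 standard deviations. None of these assumptions is taken
from the studies cited, and the function names are illustrative.

```python
import math


def Phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def share_below_cutoff(mean_shift_sd: float, cutoff_sd: float = -2.0) -> float:
    """Share of a local group scoring below a cutoff set on the norm group.

    mean_shift_sd: local mean minus norm mean, in norm SD units
                   (negative when the local mean is lower).
    cutoff_sd:     cutoff in norm SD units (hypothetical default of -2).
    """
    return Phi(cutoff_sd - mean_shift_sd)


for label, shift in [("same mean as norm group", 0.0),
                     ("local mean 1/3 SD lower", -1.0 / 3.0),
                     ("local mean 2/3 SD lower", -2.0 / 3.0)]:
    print(f"{label:26s} {share_below_cutoff(shift):.1%} below cutoff")
```

Under these assumptions a shift of 2/3 of a standard deviation roughly quadruples
the share of children falling below the cutoff, which is why norms drawn from one
locality call for caution elsewhere.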

The available data suggest caution in use of the norms for adaptive behavior
measures in other areas. The localities included thus far in studies have been
restricted to the southwest. The generalizability of these findings to other
areas is questionable. Even greater caution should be exercised in use of
California norms with other sociocultural groups such as Native Americans, Southeast
Asians, Orientals, etc.

Unresolved Issues in Adaptive Behavior

Although trite, the usual statement concerning the need for more research is 
clearly applicable to the area of adaptive behavior. A number of pressing issues
are in need of resolution. We can only hope that the resolution that must take
place during the 1980's will be guided by empirical evidence.

"Declassified" students. Use of existing adaptive behavior scales,
particularly the ABIC, may lead to large numbers of students being "declassified," i.e.,
not being eligible for special education programs for the mildly retarded. Many
others who would be eligible according to traditional criteria might not be placed
in the future. Serious questions exist concerning whether these changes are
beneficial to children.

To deny or ignore the educational problems experienced by the declassified
children would be naive and inhumane. Declassification in and of itself is a
"nonsolution." Studies of the characteristics of children declassified through
use of the ABIC have produced a fairly complex picture (Fisher, 1979; Scott, 1979).
About half of the students were regarded as eligible for other special education
classification. The other half were not eligible for existing special education
services even though their intellectual and academic performance was well below
average. Simply returning these students to regular classrooms, or avoiding special
education classification with new referrals, does nothing about the aptitude
and achievement problems. Special transitional programs have been funded for
declassified students, but these are temporary and unrelated to the problems presented
in new referrals (Meyers, MacMillan, & Yoshida, 1978; Yoshida, MacMillan, & Meyers,
1976). The concept of "permanent" transition programs stretches the imagination
just a bit. The solution of this problem may be in refining the classification and
in the type of special education services provided (see later section).

Student Role Performance and Adaptive Behavior. As we have seen, the conception
of adaptive behavior for school age children has been broadened in subsequent
revisions of the AAMD Manual. We have moved from a point where academic achievement
was the primary criterion to one current conception which completely ignores academic
achievement (i.e., the ABIC). Both positions are too narrow. Student role
performance including academic achievement should be part of the conception of
adaptive behavior for school age children (see later section).

Method of Measuring Adaptive Behavior. A traditional distinction among types
of tests is the continuum of maximum performance vs typical performance
instruments (Cronbach, 1970). Most traditional and current measures of adaptive behavior
reflect a mixture of typical and maximum performance kinds of items. With typical
performance measures the attempt is to determine how the individual customarily,
habitually, or usually performs. The emphasis is not on "can" the individual
perform the behavior, but rather on whether the individual "does" perform the behavior.
The frequently used response choices on the ABIC of "usually," "occasionally," or
"not at all" are good illustrations of this method of measurement.

A number of special problems exist with typical performance constructs and
measures. The instruments are often subject to faking or other response sets.
The ABIC attempts to control for response sets, but it is likely that the ABIC
scores are not completely free of this kind of bias. Other adaptive measures do
not control for response set biases. A second problem is the situation specific
nature of many adaptive behaviors, particularly those which involve attitudes,
social behaviors, or interpersonal competencies. Personality traits generally, as
well as many adaptive behaviors, are likely to be exhibited by the individual in
certain situations but not in others. To an unknown degree, then, an individual's
adaptive behavior score is due to internal motivational and external situational
contingencies. The degree to which the situation-specific factors in adaptive
behavior measurement are a problem is determined largely by a third concern, the
knowledge base of the respondent. The ideal situation would be a respondent who
has opportunity and skills to thoroughly and accurately report on the child's
behaviors in a wide variety of situations. Most respondents, even primary caretakers
for children, do not have opportunities to observe children in all of the settings
and roles included on adaptive behavior scales. The respondent's approach to those
items where the knowledge base is incomplete may make a large difference. An
acquiescent response set, independent of "faking good," may lead to spuriously
elevated adaptive behavior scores. The acquiescent response set may operate in the
following way. Consider an item in which the parent is asked to respond on whether
the child acts as a helper in the classroom. Most parents' knowledge about this
behavior is incomplete and second hand at best. One parent may acquiesce and say
"Yes, he/she does that sometimes" while another may say that as far as they know
the child never engages in that behavior. The problems are the limitations in
respondent knowledge and the different approaches respondents may take to answering
questions for which their knowledge is incomplete.

In addition to the problems discussed here, there are a number of other unre- 
solved issues in the area of adaptive behavior. The interested reader is referred 
to Coulter and Morrow (1978) for a discussion of these issues. 

Combining Intelligence and Adaptive Behavior Data 

In addition to the other data from the multifactored assessment, information
on adaptive behavior and intelligence is particularly important in classification/
placement decisions with the mildly retarded. How adaptive behavior is conceptualized
and measured, along with the available special education service options, will
have a significant influence on the classification/placement decisions that are made.

I suggest that the adaptive behavior dimension for school age children be
conceptualized as two separate components. One component should involve performance
in the public school setting with primary emphasis on academic achievement in the
classroom. The other component should be role performance in social systems
outside of the public school such as the home, neighborhood, and community.
Separation of the adaptive behavior dimension into two components is advisable because
recently published data suggest that adaptive behavior in academic settings and
social role performance outside of school are largely unrelated for many students
(see previous discussion).

Inclusion of academic performance in the public school in our conception of
adaptive behavior is consistent with the description of adaptive behavior for school
age children in the AAMD Manual. Two of the nearly universal features of
conceptions of adaptive behavior are age appropriate criteria and cultural context.
Analysis of developmental task theory leads to recognition of the importance of
academic performance during the ages of about 5 to 16 or 18 in our culture. Academic
role performance is an important cultural expectation that is common to all major
groups in our society. If adaptive behavior is "the way an individual performs
those tasks expected of someone his (her) age in his (her) culture" then academic
performance must be included in any comprehensive view of the construct of adaptive
behavior for school age children.

Our conception of adaptive behavior should not be restricted to role
performance in academic settings. Other social roles and other social systems are also
important domains of development. Again the conception of adaptive behavior in the
AAMD Manual and developmental task theory can be cited as foundations for this
second component. During the school age years children perform
a variety of social roles of increasing complexity in various social systems. To
ignore the child's strengths and weaknesses in social systems outside of the school
would also constitute a serious deficiency in our view of adaptive behavior.

Classification and placement decisions with the mildly retarded should be based
on information from both components of adaptive behavior and the dimension of
intelligence. Tables 8 and 9 provide a model for a two dimensional conception of
adaptive behavior and a scheme for combining information on adaptive behavior in
classification/placement decisions.

The different combinations of adaptive behavior and intelligence have implica- 
tions for classification and placement decisions. Adaptive Behavior-School (AB-S) 
should be based on a complete educational evaluation including observation in the 
classroom, examination of samples of daily work, teacher interview, and the results 
of individually administered standardized achievement tests. Adaptive Behavior-Outside
School (AB-OS) should be based on information from formal inventories such as
the ABIC, where appropriate, or informal data collection procedures.

Of particular interest are the children who exhibit the pattern of very low
intelligence, very low AB-S, and normal AB-OS. A major current dilemma is whether
these children should be classified and placed in special education programs. Such
children are almost by definition "Six Hour Retarded Children." If they are
classified and placed in special education programs we will almost inevitably
overrepresent minority children. In my view these children should be served in special
education programs in most instances because they do in fact have extreme educational
needs that are typically beyond the scope of regular classroom instruction. The
solution of "delabeling" these children does not address these needs. However, the
segregated special class for the mildly retarded which has often been the placement
used, because in many cases it was the only alternative, is an equally inappropriate
solution.

Table 8

Conception of Adaptive Behavior for School Age Children

ADAPTIVE BEHAVIOR: SCHOOL BASED

Rationale:   1) Mastery of literacy skills is a key developmental task for
                persons between the ages of 5 and 17.

             2) The expectation for and emphasis on educational competencies is
                common to most if not all major sociocultural groups.

Assessment:  1) Collection and consideration of a broad variety of information
                including teacher interview, review of cumulative records,
                examination of samples of classroom work, classroom observation,
                results of group standardized achievement tests, results of
                individual achievement tests, diagnostic achievement tests, and
                other informal achievement measures.

ADAPTIVE BEHAVIOR: OUTSIDE OF SCHOOL

Rationale:   1) Mastery of a variety of non-academic competencies also is
                expected, and a key developmental task between the ages of 5 and 17.

             2) The expectations for and opportunities to develop non-academic
                competencies may vary among sociocultural groups.

Assessment:  1) Collection of information on social role performance outside of
                school in areas such as: peer relations, family relationships,
                degree of independence, responsibilities assumed,
                economic/vocational activities, etc.

             2) Method of collecting data may include formal measures, interviews
                with parents, interview with student, etc.



The solution to this dilemma depends on two developments. First, we need a
more refined classification system which would differentiate between what Mercer
(1973) called the Quasi and Comprehensively Retarded. According to Mercer's scheme
the Comprehensively Retarded are persons who fail both components of the adaptive
behavior dimension and the intelligence dimension. The Quasi-Retarded exhibit the
same pattern except for normal social role performance outside of school. The
overrepresentation of minorities in special education classes for the educable mentally
retarded in the Riverside, California schools (Mercer, 1973), and in other locations
as well, is largely attributable to placement of Quasi-Retarded children who come
from minority backgrounds. Should these children be labeled as mentally retarded?
Opinions on this issue differ sharply (Goodman, 1979; Mercer, 1979).

A refinement in the classification system would be beneficial in resolving this
dilemma. The terms Comprehensive and Quasi are probably objectionable to many as
is the term mental retardation. Use of the terms Educational Retardation,
Educationally Handicapped, or some other term which is as behaviorally descriptive as
possible of the Quasi-Retarded pattern would be preferable. Greater refinement in 
the classification system is useful only if there are implications for placement 
decisions and educational programming. The change suggested may have such
implications.

[Flow chart: classification and selection of program option following referral]

REFERRAL
  Adaptive Behavior, School Based
    Average: consider other classifications
    Significantly Subaverage: proceed to Intelligence (Academic Aptitude)
  Intelligence (Academic Aptitude)
    Average: consider other classifications
    Significantly Subaverage: proceed to Adaptive Behavior, Outside of School
  Adaptive Behavior, Outside of School
    Average
      Classification: "Quasi-Retarded," "Educationally Retarded,"
        "Educationally Handicapped," "Academic Aptitude Handicap"
      Selection of Program Option: resource option in early and middle grades
    Significantly Subaverage
      Classification: "Comprehensively Retarded," Mentally Retarded,
        Mentally Disabled
      Selection of Program Option: special class with integration; special class



The "Quasi-Retarded" do need special services. However, if special education
services are to be provided, the objectives should be oriented toward specific
academic needs rather than broad social competencies. In most instances the resource
program involving remedial and compensatory tutorial services is a more appropriate
option than the special class. Special class programs for the mildly retarded have
traditionally placed considerable emphasis on broadly defined social competencies
and "functional" academic skills (Kolstoe, 1976). This emphasis is clearly
appropriate for the comprehensively retarded, but is probably misdirected for most of the
Quasi-Retarded. With few exceptions the Quasi-Retarded, if placed in special
education, should be placed in resource programs.
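
The classification and program-option scheme described above can be sketched as a
small decision function. This is an illustration only. The boolean inputs stand in
for whatever "significantly subaverage" criteria a district adopts, the routing of
a child with average intelligence to "consider other classifications" is my reading
of the scheme rather than a stated rule, and all names are hypothetical.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    classification: str
    program_option: str


def classify(ab_school_subaverage: bool,
             intelligence_subaverage: bool,
             ab_outside_subaverage: bool) -> Decision:
    """Combine AB-S, intelligence, and AB-OS findings into a decision.

    Each argument is True when performance on that dimension is judged
    significantly subaverage; how that judgment is reached is outside
    this sketch.
    """
    # Average school performance or average intelligence: the mild
    # retardation classifications do not apply (assumed reading).
    if not ab_school_subaverage or not intelligence_subaverage:
        return Decision("consider other classifications",
                        "not determined by this scheme")
    # Subaverage on all three dimensions.
    if ab_outside_subaverage:
        return Decision("Comprehensively Retarded",
                        "special class, or special class with integration")
    # Subaverage in school and on intelligence, normal role performance
    # outside of school.
    return Decision("Quasi-Retarded (Educationally Handicapped)",
                    "resource program")


print(classify(True, True, False))
print(classify(True, True, True))
print(classify(False, True, False))
```

The point of the sketch is that adaptive behavior outside of school changes the
classification and the program option, not whether the child receives services.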

Use of the resource option for the Quasi-Retarded would alleviate many of the
concerns expressed by Federal District Courts in the placement litigation. The
amount of time spent outside of the educational mainstream is minimized by the
resource option, thus reducing the very proper concern about racial segregation.
Placement in the resource option regardless of classification used may have the
additional advantage of being less stigmatizing. Analysis of outcome data must,
of course, be the ultimate criterion against which this or any other
classification/placement system is validated.

Refined classification decisions along with selection of service option, re- 
source vs special class, appear to be promising applications of adaptive behavior 
assessment. Other applications of adaptive behavior data with the mildly
handicapped are also promising. General strengths and weaknesses across different
domains of behavior may be the initial source of information for developing
interventions designed to improve social skills, assertiveness, etc. The information from
currently available instruments such as the ABIC is not sufficiently precise for
direct translation to intervention objectives. Data from the ABIC, AAMD-PS, or
Vineland can alert us to general needs which can then be translated to specific
objectives through additional observation and/or interview.

SOCIOCULTURAL BACKGROUND 

The PL 94-142 Rules and Regulations list social or cultural background as one
of the areas that "shall" be considered in placement decisions. The apparent
purpose of the regulation is to ensure that socioeconomic and cultural factors are
considered in interpreting information from other sources. Consideration of such
factors was suggested in the placement litigation (e.g., Guadalupe case) of the
early 1970's where in several instances bilingual children were allegedly misplaced
in special class programs for the mildly retarded. Nearly everyone would agree
that social, economic, and cultural background factors should be assessed and
considered in classification/placement decisions. In extreme situations, e.g.,
Southeast Asian students who have recently immigrated to the United States, most would
agree that conventional measures should be interpreted in light of sociocultural
factors and that special education classification/placement decisions should be
delayed until the child has a chance to learn the language of the school, become
familiar with American culture, etc. Such children may still need special services,
but conventional special education classifications are inappropriate. The difficult
issues in this area are the consideration and use of such data with native
born Americans who are to varying degrees different from the majority population
on social and cultural variables. The major questions are how to assess the
sociocultural variables and how this information should be used in classification/placement
and educational programming decisions. As is the case with several of the
Federal Rules and Regulations, there is no elaboration or guidelines for the
measurement and use of sociocultural information. Furthermore, in contrast to adaptive
behavior, there are even fewer resources in psychology or education in terms of
theory, research, and instruments that can be applied to the area of sociocultural
factors.

The Concept of Eth-class 

The concept of sociocultural background includes the overlapping factors of
social class and race or ethnicity. Mercer (1979) used the term eth-class from
sociology to refer to these combined effects. The concept of eth-class, or the more
commonly used term of sociocultural background, is needed to accurately describe the
variations among groups on measures of intelligence or achievement.

Socioeconomic Status (SES). Social class differences exist within, and
unfortunately, between major ethnic-racial groups in the United States. These differences
are associated with a variety of conditions related to economic resources,
educational level, attitudes and values, religious and political preferences, etc. In
short, social class differences influence the individual's lifestyle and
opportunities. Most pertinent to this chapter are the discontinuities between "the middle
class teacher and the every-class child" (McCandless, 1967). The interested reader
is referred to the excellent discussion by McCandless of the practical significance
of social class in terms of child development and education.

Measures of social class vary from relatively simple occupational scales to 
four or five factor indices based on occupation, educational level, source of in- 
come, housing type, and area of residence. In most published research the measure 
of social status is typically based on occupation and educational level of the 
child's parent(s). These two sources of information are relatively easy to obtain
and are closely related to the results of the more thorough measures of SES.

That SES is related to measured intelligence has been known at least since the
early years of this century. The relationship is far from perfect. The
correlations between SES and intelligence or achievement are typically in the range of .3
to .4; the range of performance within SES levels is fairly large; and considerable
overlap of distributions is typical. The relationship of SES to average levels of
intelligence appears to be more impressive. For example, Kaufman and Doppelt (1976)
reported mean differences of 9 to 17 points for both blacks and whites between the
highest and lowest SES groups in the WISC-R standardization sample. Multiple
correlations in the .30's and .40's have been reported between WISC-R IQs and race,
sex, and SES (Reynolds & Gutkin, 1979) with SES being the best predictor of
intellectual level.

SOMPA Sociocultural Scales. The Sociocultural Scales (SC) in SOMPA are more
sophisticated than traditional measures of socioeconomic status. Some information
on cultural background is also included. The SC are based on 22 questions (24 items)
which are organized into nine factors and four sociocultural modalities. The
modalities, factors, and type of information gathered through the SC are presented in
Table 10. The SC are administered to the primary caretaker of the child in an
interview that also includes the SOMPA ABIC and SOMPA Health History Inventory.



Table 10

SOMPA Sociocultural Scales

Modality               Factor(s)                  Type of Information

Family Size            Family Size                [illegible]

Family Structure       Parent-Child               Relationship of child to parents, gender
                       Relationship               of head of household, and marital status
                       Marital Status

Socioeconomic          Occupation                 Kind of work of head of household and
Status                 Source of Income           source of income for family

Urban                  Sense of Efficacy          [illegible]
Acculturation          Community Participation
                       Anglicization
                       Urbanization

The items on the SC are based on published research concerning factors
related to measured intelligence. The correlations of the factors and modalities
differ in size within and between ethnic groups. The multiple correlations between
the SC and WISC-R Full Scale IQs vary from .37 to .42 depending on group (Anglo,
Black, or Hispanic) (Mercer, 1979, Table 44). The correlation of the four
modalities with the Full Scale IQ score varies for different groups. Socioeconomic Status
has the highest correlation for Anglos (.39) while Urban Acculturation is highest
for Blacks and Hispanics (.30 and .37 respectively). The Family Structure Modality
has relatively low correlations with the WISC-R for Anglos and Blacks (.13 and .15
respectively) and is not correlated with any of the WISC-R scores for Hispanics.



Mercer's argument for pluralistic norms (see below) was bolstered by the data
on the relationship of the SC to the WISC-R. She suggested three criteria
indicating the need for pluralistic norms: 1) Significant differences among groups on
measures of intelligence; 2) Significant differences among groups in sociocultural
measures; and 3) Sociocultural measures account for a significant amount of the
variation in measured intelligence within and between groups. These criteria were
met in her studies of California school age children representing three groups
(Anglo, Black, and Hispanic). The subsequent development of pluralistic norms has
become the most controversial aspect of SOMPA.

SOMPA Estimated Learning Potential. The SOMPA Estimated Learning Potential
(ELP) procedure is the formal method developed by Mercer to eliminate the biases
in IQ tests. Multiple regression equations using the Sociocultural Scales (SC)
as predictors and the WISC-R IQ scales as criteria were developed for the three
groups in the California standardization sample. Separate regression equations
are used for each group. Although seemingly complex, the entire procedure simply
involves changing the WISC-R mean and standard deviation to 100 and 15 respectively
for all groups, and then computing individual scores through differential weighting
of the four SC modalities. The amount of change for any individual within each of
the groups depends on his/her sociocultural characteristics. The net effect is to
remove group differences through an algebraic transformation. The question now,
as posed by one of the commentators on SOMPA, is "The Algebra Works - But What
Does It Mean?" (Brown, 1979).
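
The algebra can be sketched in a few lines. The version below assumes that the
adjusted score is the child's deviation from the IQ predicted by the group's
regression equation, divided by the spread of scores around that prediction and
rescaled to a mean of 100 and a standard deviation of 15. That reading, the class
and function names, and every coefficient shown are assumptions made for
illustration. They are not the published SOMPA equations.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GroupEquation:
    """Regression equation for one sociocultural group (hypothetical values)."""
    intercept: float
    weights: Sequence[float]      # one weight per SC modality
    residual_sd: float            # spread of IQs around the predicted value

    def predicted_iq(self, sc_scores: Sequence[float]) -> float:
        if len(sc_scores) != len(self.weights):
            raise ValueError("one SC score is needed for each modality weight")
        return self.intercept + sum(w * s for w, s in zip(self.weights, sc_scores))


def estimated_learning_potential(iq: float, sc_scores: Sequence[float],
                                 equation: GroupEquation) -> float:
    """Rescale the gap between observed and predicted IQ to mean 100, SD 15."""
    z = (iq - equation.predicted_iq(sc_scores)) / equation.residual_sd
    return 100.0 + 15.0 * z


# Made-up equation and SC modality scores for a single child.
equation = GroupEquation(intercept=80.0, weights=(0.1, 0.05, 0.2, 0.15),
                         residual_sd=12.0)
sc_scores = (10.0, 20.0, 30.0, 20.0)

print(equation.predicted_iq(sc_scores))
print(estimated_learning_potential(91.0, sc_scores, equation))
print(estimated_learning_potential(103.0, sc_scores, equation))
```

In this form a child who scores exactly at the level predicted for his or her
sociocultural background receives an adjusted score of 100, whatever the
conventional IQ.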



Before reviewing research and commentaries on the ELP it perhaps is important
to recognize that Mercer's ELP procedure is not the first time that someone
suggested changing IQs depending on the child's background. Platt and Bardon (1967)
quoted Havighurst (1951) as recommending "A good rule to follow is to add ten
points to the IQ of all children who come from underprivileged homes or homes
where English is not spoken as the first language." The reasons for this sort of
adjustment of scores have never been altogether clear. They seem to be related to
myths that IQs should represent innate ability. That myth continues to haunt our
efforts and has been a part of the SOMPA debate. The ELP does represent a much
more systematic and logical method for adjusting scores.

Meaning of ELP scores. One of the obvious questions about use of the SOMPA
ELP scores is whether the California norms can be generalized to other areas and
to other sociocultural groups, e.g., Native Americans. The SC data and ELP
regression equations provided by SOMPA are based entirely on the SOMPA
standardization sample of California school age children. The California normative
data will not be accurate for other areas or groups if there are significant
differences in mean WISC-R scores, in sociocultural characteristics, or in the
relationship of the SC to WISC-R scores. Mercer (1979) suggests that local norms be
developed through studies of representative samples of normal children. The minimum
sample size recommended is 25 males and 25 females at each of 7 age levels, or a
total sample of 350 students (Mercer, 1979, p. 144). Data to be collected include
the WISC-R and the SC. In addition, fairly complex statistical analyses are required.
Due to limitations in resources, it is highly unlikely that studies meeting these
criteria will be conducted in very many localities or for very many groups.
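
The arithmetic behind this recommendation can be sketched in a few lines of
Python. This is only an illustration: the function name and the `n_groups`
parameter are assumptions introduced here (the source gives only the
25 x 2 x 7 = 350 figure), and the example of three groups is hypothetical.

```python
# Sketch of the minimum local-norming sample implied by Mercer (1979, p. 144).
# Only the constants below come from the text; everything else is illustrative.

CHILDREN_PER_CELL = 25  # children of each sex at each age level (from the text)
N_SEXES = 2             # males and females (from the text)
N_AGE_LEVELS = 7        # age levels (from the text)


def minimum_sample(n_groups=1):
    """Return the minimum number of children to be tested.

    n_groups is a hypothetical parameter: the number of distinct
    sociocultural groups for which separate local norms are wanted.
    """
    return CHILDREN_PER_CELL * N_SEXES * N_AGE_LEVELS * n_groups


if __name__ == "__main__":
    print(minimum_sample())            # one group: the figure given in the text
    print(minimum_sample(n_groups=3))  # hypothetical: three groups in one locality
```

Even under these simplified assumptions, the number of individually tested
children grows quickly with each additional group, which is the practical
obstacle noted above.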

Studies on the generalizability of ELP scores have been conducted with data
sets from Pima County, Arizona, and Austin, Texas (Reschly, 1980; Oakland, 1980).
The limited data available now suggest caution in using the California ELP norms
in other areas or with other groups. The studies that have been conducted involved
samples from the Southwest. These populations are likely to be more like California
samples than, say, samples of Hispanic children from New York City or samples of black
children from South Carolina.

A second major concern has to do with the possible effects of the use of ELP
scores in classification/placement decisions. A number of authors have expressed
strong reservations about using ELP scores in classification/placement decisions
(e.g., Clarizio, 1979). However, this is the primary purpose of the ELP concept,
and these scores will undoubtedly be used by some in classification/placement
decisions.

For all children, the ELP score is either the same as or higher than the
conventional WISC-R IQs. The magnitude of these differences sometimes is quite large.
Direct use of the ELP score will therefore have implications for children classified
as mildly retarded. Fisher (1978) reported that 40 to 75% of the children currently
classified as EMR would be "declassified" if the ELP score were used rather than the
conventional score. The greatest effect was on minority students, with little change
noted for Anglos. The obvious question that remains is whether the possible
"declassification" due to use of ELP will be beneficial to students.

A third issue has to do with the validity of the ELP scores. Mercer (1979)
argues strongly that the validity of ELP must be assessed in the context of the
values and purposes of her Pluralistic Assessment Model. From that perspective,
the ELP is valid in that the differences between sociocultural groups are eliminated,
and variations within groups are preserved. The broader perspective adopted by
critics is that ELP, in order to be useful, must relate to other criteria. The question
is, which criterion? The relationship of ELP to achievement test scores or teacher
ratings of achievement is not as strong as the relationship of the conventional scores
(Oakland, 1980; Reschly, 1978). Mercer rejects these data as irrelevant to the
construct of ELP. In her view the key is not the relationship of ELP to past
achievement, but rather the degree to which ELP would predict acquisition of new
material, or learning rate. The technology available to assess learning potential or
learning rate is not well developed or easily applied. This kind of study, i.e., relating
ELP to learning rate, is needed in order to establish the predictive validity of
ELP as well as clarify a number of fascinating theoretical issues.

One of the most intense debates concerning SOMPA is over the issue of
separating ignorance from stupidity (Goodman, 1979; Mercer, 1979). One of the purposes of
the ELP is to determine whether the child is "stupid" or merely "ignorant." This
clearly borders on the notion of attempting to separate innate potential from
current level of functioning, or true mental retardation from pseudo-mental
retardation. Children with low conventional scores and high ELPs are presumed to be
ignorant, while those who score low on both might be regarded as stupid (or "truly"
retarded). This argument is probably beyond resolution through empirical study,
although data on the relationship of ELP to learning rate would be interesting in this
context. The broader issues in the ignorance-stupidity debate are the meaning of
mild mental retardation and the meaning of IQ test scores.

Although much of the discussion of ELP in this chapter has been skeptical,
the ELP concept may be highly useful in one important area, clarifying the
meaning of IQ test results. In SOMPA Mercer renames the conventional WISC-R scores
as School Functioning Level (SFL). Although I would prefer the term academic
aptitude, the result is virtually the same. Renaming what the tests measure may
reduce misconceptions about IQ test results.

PRIMARY LANGUAGE 

The assessment of primary language competence is a logical, common sense
procedure as well as a requirement in the recent legislation. Non-English speaking
children have apparently been placed in programs for the mildly retarded on the
basis of tests administered in English (see Diana or Guadalupe cases). These
classification and programming decisions were inappropriate, although an even larger
problem in those situations was the apparent absence of alternative programs for
non-English speaking youth.

Assessment of primary language competence is more difficult than it might
appear. Many instruments have been developed recently (see Oakland, 1977), but
little systematic work has been conducted on their reliability and validity.
Nevertheless, systematic effort to assess primary language competence is needed.
The decision about primary language competence must be based on data. The
presence of a Latino surname, for example, is certainly not sufficient to conclude
that the child or family uses Spanish as the dominant language. The author is
acquainted with cases of Latino surnamed families where Spanish is not spoken, and
has not been used in the family for several generations. Conversely, the author
encountered a case in 1967 in eastern Iowa where the child had an Anglo surname,
but was monolingual Spanish speaking.

The information on primary language is important in collecting and
interpreting other assessment data, and in decisions about appropriate interventions. If
the child is monolingual, non-English speaking, perhaps the wisest course of action
is to simply avoid the use of norm-referenced standardized tests of achievement and
ability. The 94-142 regulations suggest use of an interpreter. Due to the many
problems which arise when attempts are made to translate tests into other languages,
e.g., items do not have the same meaning and difficulties of items change, the
results of translated tests are of questionable value. If inferences must be made
about ability, use of nonverbal or performance tests is probably the best course
of action. Educational programs for monolingual non-English speaking students
should be provided in the students' native language if at all feasible. If only
a few monolingual children attend schools in a particular district, then other
alternatives should be pursued (see Oakland, 1977).

Bilingual children may exhibit widely varying competencies in English and
another language. The range will extend from limited to high degrees of competence
in either or both languages. The language dominance measure that is used to
determine primary language should be supplemented by other measures which yield
information on competence in both languages. Subsequent assessment activities should be
conducted in the dominant language of the child. An important principle to
remember is the assumption of maximum performance. Any inference about ability or
academic aptitude made in subsequent assessment activities should include
consideration of the effects of differences in language. Bilingual youth may, though
certainly not always, obtain lower scores on verbal measures administered in English
due to limited exposure to English. Special education services may not be the
appropriate intervention for bilingual children who, on the basis of other data, meet
the state guidelines for special education classification. Bilingual/bicultural
programs may be more appropriate, and children's rights to such services have been
established through recent litigation.

MILD MENTAL RETARDATION: "A CONTINUING DILEMMA"

Much of the professional debate, litigation, and legislation over bias in
assessment involves implicit and contradictory assumptions about the nature of mental
retardation. The meaning of mild mental retardation, called "a continuing dilemma"
by Zigler (1967), has been a particular problem. Consensus regarding the meaning
of this diagnostic construct would greatly assist efforts to resolve the issues
discussed in this paper.

Definitions and Classification Criteria 

Terminology and classification criteria in mental retardation have evolved
throughout the present century. There are two major sources of terminology and
classification criteria which are crucial for diagnostic personnel in the public
schools. The American Association on Mental Deficiency (AAMD), the major
professional organization in the field, publishes a terminology and classification manual.
The AAMD Manual is revised periodically, with the most recent revisions published in
1977, 1973, and 1961 (Grossman, 1973, 1977; Heber, 1961). The 1973 and 1977
revisions are virtually identical. The AAMD Manual on Terminology and Classification
has a significant influence on other definitions and classification criteria in
mental retardation. The influences of the 1961 and 1973 versions are to varying degrees
reflected in state education codes, the second major source of guidelines for
terminology and classification in mental retardation. State education codes usually
provide a definition and classification criteria for mental retardation which are
to be applied by public school diagnostic personnel. Although the AAMD system is
important, it should be noted that decisions in the public schools are to be based
on the state definition and criteria for mental retardation. The terminology,
classification criteria, etc., for mental retardation vary considerably among
states (Patrick & Reschly, 1980). Knowledge of your current state code, usually
published in the State Special Education Rules and Regulations, is a necessity for
diagnostic personnel.

MacMillan (1977) and Robinson and Robinson (1976) provide thorough analyses 
of the AAMD classification system in mental retardation. Some of the most import- 
ant characteristics of the AAMD system are the following. 

1. Bi-Dimensional. The individual must exhibit deficits in both
intelligence and adaptive behavior in order for the classification
of mental retardation to be appropriate.

2. Developmental. The deficits in intelligence and adaptive behavior
must appear during the developmental period, which is
defined as the ages of birth to 18.

3. Current Status. "Mental retardation is descriptive of current
behavior and does not necessarily imply prognosis" (Grossman,
1977, p. 11).

4. Etiology. Etiology of mental retardation is not specified in
the definition. Etiology may be due to psychosocial, psychogenic,
or biological influences.

5. Continuum. All types and levels of mental retardation are implicitly
organized on the same continuum ranging from mild to profound.

6. Levels. The level of severity of mental retardation is
specified by standard deviation (s.d.) cut off points for
IQ test scores.

    Mild (Educable)        -2 s.d.   IQ Range of about 55-69
    Moderate (Trainable)   -3 s.d.   IQ Range of about 40-54
    Severe                 -4 s.d.   IQ Range of about 25-39
    Profound               -5 s.d.   IQ Range of about 24 and below

7. Adaptive Behavior. The criteria for adaptive behavior depend
on the age of the person. (See earlier discussion.)
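
The IQ ranges listed under Levels follow from simple arithmetic if one assumes
a scale with a mean of 100 and a standard deviation of 15, as on the Wechsler
scales. The sketch below is illustrative only: the names are hypothetical, the
mean and s.d. are assumptions not stated in the list, and an IQ score alone
never suffices for classification under the AAMD system, since deficits in
adaptive behavior are also required.

```python
# Sketch: derive the approximate AAMD IQ ranges from s.d. cut-off points.
# Assumption (not stated in the list above): mean = 100, s.d. = 15.

MEAN_IQ = 100
SD_IQ = 15

# (level name, cut-off in s.d. units below the mean), mildest level first
LEVELS = [
    ("Mild (Educable)", -2),
    ("Moderate (Trainable)", -3),
    ("Severe", -4),
    ("Profound", -5),
]


def iq_ranges(mean=MEAN_IQ, sd=SD_IQ):
    """Return {level: (low, high)}; low is None for the lowest level."""
    ranges = {}
    for i, (name, cutoff) in enumerate(LEVELS):
        high = mean + cutoff * sd - 1  # highest score just below the cut-off
        if i + 1 < len(LEVELS):
            low = mean + LEVELS[i + 1][1] * sd  # next, more severe cut-off
        else:
            low = None  # "and below"
        ranges[name] = (low, high)
    return ranges


def level_for_iq(iq, mean=MEAN_IQ, sd=SD_IQ):
    """Return the level whose IQ range contains iq, or None if above all ranges."""
    for name, (low, high) in iq_ranges(mean, sd).items():
        if iq <= high and (low is None or iq >= low):
            return name
    return None


if __name__ == "__main__":
    for name, (low, high) in iq_ranges().items():
        print(name, low, high)
```

A score of 72, for example, falls above every range, so `level_for_iq(72)`
returns `None`. A score within a range meets only the intelligence criterion;
the adaptive behavior criterion must be evaluated separately.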

The AAMD classification scheme depicts mental retardation as a current status
with no implications for etiology or prognosis. Further, the notion of mental
retardation as situational (vs comprehensive) incompetence is at least implied in
the different criteria for adaptive behavior depending on the age of the individual.
Clearly, the AAMD classification scheme does not require that mental retardation be
a permanent status, or be due to biological anomaly.

Much of the litigation as well as other discussions of bias in assessment
reflect the implicit misconception that mental retardation requires permanent
incompetence, comprehensive incompetence, and biological anomaly. Mercer's report on the
Riverside studies (Mercer, 1973), the Larry P. Opinion, and the concern for "six
hour" retarded children are examples of these misconceptions. "Six hour" retarded
children were described in a 1970 President's Committee on Mental Retardation report
as retarded only in the public school context, thus the adjective "six hour." In
other social settings they were described as coping in ways that "...may be
exceptionally adaptive to the situation and community in which they live." Should these
students, who are failing in the classroom, have low intelligence and achievement
scores, etc., be classified as mentally retarded?

The answer to this question obviously varies according to what the diagnostic
construct of mental retardation means. The AAMD Manual and most state education
codes would allow classification in the mild or educable level of mental retardation.
Whether such children are "truly" retarded, i.e., have permanent and comprehensive
impairment due to biological anomaly, is largely irrelevant in these classification
systems. They may be classified as mildly retarded on the basis of serious problems
in the classroom, low intelligence and achievement scores, etc. However, from
Mercer's perspective as well as that of the courts and the Federal Office for Civil
Rights, these children are not "truly" retarded.

The current debate over true vs pseudo or quasi retardation is reminiscent of
the earlier discussion of pseudofeeblemindedness (Benton, 1956). The 1961 and
subsequent revisions of the AAMD Manual represented attempts to avoid the issues of
precise etiology (which usually is unknown) and prognosis (which often is unclear).
However, there is an implicit problem in the AAMD system which contributes to the
confusion over the meaning of mental retardation. The mild (or educable) level of
mental retardation is markedly different from the moderate, severe, and profound
levels on a number of dimensions. Among these dimensions are:



1. Etiology. The presumed etiology for most cases of mild mental
retardation is the AAMD category of psychosocial. The vast
majority of the mildly retarded do not exhibit any evidence
of biological anomaly. In contrast, the more severely retarded
almost always have biological anomaly, although the precise
etiological mechanism often is unknown.

2. Age. Mild mental retardation is rarely diagnosed prior to
age 5 or 6, and the highest prevalence is usually found in late
childhood or early adolescence. In contrast, most cases of
moderate, severe, or profound mental retardation are diagnosed
during the preschool years, usually prior to age 2.

3. Comprehensive. The behavioral deficits of children who are
classified as mildly retarded are usually restricted to the
public school setting. Performance in other settings is
usually regarded as normal by significant others such as
parents, siblings, and other adults. The moderately, severely,
and profoundly retarded are typically regarded as
retarded in nearly all settings.

4. Socioeconomic Status (SES). There is a strong association
between socioeconomic status and mild mental retardation.
Children who are diagnosed as mildly retarded are much more
likely to come from low SES environments. The relationship
between SES and other levels of mental retardation
is very weak, if it exists at all.

5. Ethnic/Racial Status. The prevalence of mild mental
retardation is higher among specific ethnic/racial groups
if the group is also of lower SES. The more severe levels
of mental retardation are not found with any greater
frequency among specific ethnic/racial groups.

6. Permanence. Most of the persons diagnosed as mildly
retarded become independently functioning, self-supporting
adults (Baller, Charles, & Miller, 1967). The diagnosis
of mild mental retardation is therefore not permanent for
these individuals, since their adaptive behavior during
adulthood is within normal limits. Again in contrast,
nearly all of the more severely retarded are unable to
function with complete independence or become entirely
self-supporting at any time during their life span. At
the more severe levels (moderate, severe, and profound)
mental retardation is, almost without exception, a
permanent condition.

The implicit problem in the AAMD system is that all levels of mental retarda- 
tion are placed on the same continuum despite the differences cited above in the 
mild vs the more severe levels. The adjectives mild, moderate, etc., have not been 
effective in communicating these differences. Another way of analyzing this
problem is to consider the connotative and denotative meanings of mental retardation.
The denotative, or precise scientific, meaning of mental retardation is restricted 
to current status with no assumptions about etiology or prognosis. However, the 
connotative (everyday, lay public) meaning of mental retardation is that of compre- 
hensive incompetence and permanent disability due to biological anomaly. Changes 
in terminology and conceptions of diagnostic constructs are indicated when the as- 
sociated connotative and denotative meanings are widely divergent. This appears to 
be the case with mental retardation. 

Revision in the mental retardation classification system in the form of clear
separation of mild from other levels of mental retardation would aid in solving
this problem. Terms such as educationally retarded or academically handicapped
would be more appropriate descriptors of the kinds of problems displayed by students
classified now as mildly retarded. Other terms might be better than the examples
used here. The point is to reduce the miscommunication and misconception about
mental retardation. Revisions in the classification system would assist in resolving
some of the issues discussed in this paper. Such revisions, however, are not
panaceas.

NONBIASED ASSESSMENT: SOME TENTATIVE CONCLUSIONS

Nonbiased assessment is obviously an extremely complex issue. Concerns with
the meaning and usefulness of IQ test results have dominated much of the discussion
of nonbiased assessment. The issues surrounding the meaning of IQ (academic
aptitude) have been debated for at least sixty years, and are not likely to be resolved
in the near future. However, many other issues such as the meaning and etiology of
mild mental retardation, the rights of parents and students, the effectiveness of
special education interventions, and the definition of bias in tests are clearly
involved with our efforts to reduce bias in assessment. These issues have been
discussed in this paper, though certainly not resolved.

There are two possible reactions, among a range of possible reactions to the
pressures for nonbiased assessment, which could be damaging to children. One
possible reaction is to conclude that the issue is so complex and ill defined that
there is nothing we can do; hence, we should stubbornly defend and simply continue
our current practices. This reaction will be maladaptive. There are important
changes that we can make which will enhance the fairness and usefulness of
assessment for all children. In the interests of children, we need to make these changes.
A second maladaptive reaction is to reject most if not all of our current
instruments and practices. For example, some have rejected the use of IQ tests with
culturally different children. Others have severely limited the numbers of culturally
different children in special education programs simply on the basis of their
proportions in the population. Such reactions are not in the best interests of children.

Positive reactions to the concerns about nonbiased assessment must first be
based on a recognition of the ambiguity of the current situation. There are no
easy solutions, and there probably never will be.

Recognition of the underlying assumptions in the special education placement
litigation provides an orientation to the most important issues in nonbiased
assessment. One can only wonder if these cases would have appeared IF the interventions
were effective; IF due process safeguards had been observed; IF the interventions
had been consistent with the principle of least restrictive alternative, i.e., had
not been provided in segregated, self-contained special classes; IF the assessment
had been multifactored and programs based on specific educational need; and so on.
The fact is that assessment and programs did NOT meet these criteria in at least
some, and perhaps many, instances. The litigation and legislation are attempts to
correct these abuses. From the perspective of related services personnel and
special educators, the current demands for nonbiased assessment, along with the other
requirements from the courts and legislation, are the best things that have happened
for our professions (and for children).

Three general themes should form the basis for efforts to achieve nonbiased 
assessment. First, and most important, we must continue and expand our efforts to 
insure that assessment procedures result in positive benefits for individuals. 
This goal is certainly not new. The underlying assumption of positive benefit to 
individuals has always been the goal in all types of assessment. Realization of 
this goal requires more concern about the relationship of our assessment activities
to interventions, and more concern about the effectiveness of these interventions.

A second theme is the need to implement the idea of multifactored assessment.
Again, this is not a new idea. However, the degree to which comprehensive
assessment was conducted, documented, and used in planning interventions has varied
considerably. The proper role of IQ tests in the multifactored assessment must be
recognized. Areas often ignored in the past, e.g., adaptive behavior outside of
school, primary language competence, and sociocultural background, should be a part
of the assessment process. These newer areas of assessment, along with the
conventional areas, are important to better understanding of children. Fuller
understanding can lead to better, more refined classification decisions and more effective
interventions.

Finally, our understanding of nonbiased assessment and our ability to
implement these procedures will be enhanced if we view nonbiased assessment as a process
rather than a set of instruments. The process is oriented toward insuring fairness
and effectiveness of assessment and interventions for all children. The process is
appropriate in all settings regardless of the ethnic or racial composition of the
student population. The nonbiased assessment process is perhaps best illustrated by
the series of questions developed by the Northeast Regional Resource Center. A copy
of this document is included in an appendix. A second guideline, which follows, was
developed by a committee appointed by the Iowa Department of Public Instruction.
Both documents are attempts to identify key features of a nonbiased (and effective)
assessment process.

PROTECTION IN EVALUATION PROCEDURES PROVISION OF P.L. 94-142



PROTECTION IN EVALUATION PROCEDURES

(a) Each State educational agency shall insure that each public agency
establishes and implements procedures which meet the requirements of
§§ 121a.530-121a.534.

(b) Testing and evaluation materials and procedures used for the purposes of
evaluation and placement of handicapped children must be selected and
administered so as not to be racially or culturally discriminatory.

(20 U.S.C. 1412(5)(C).)

§ 121a.531 Preplacement evaluation.

Before any action is taken with respect to the initial placement of a
handicapped child in a special education program, a full and individual evaluation
of the child's educational needs must be conducted in accordance with the
requirements of § 121a.532.

(20 U.S.C. 1412(5)(C).)

§ 121a.532 Evaluation procedures.

State and local educational agencies shall insure, at a minimum, that:

(a) Tests and other evaluation materials:

(1) Are provided and administered in the child's native language or other mode
of communication, unless it is clearly not feasible to do so;

(2) Have been validated for the specific purpose for which they are used; and

(3) Are administered by trained personnel in conformance with the instructions
provided by their producer;

(b) Tests and other evaluation materials include those tailored to assess
specific areas of educational need and not merely those which are designed to
provide a single general intelligence quotient;

(c) Tests are selected and administered so as best to ensure that when a
test is administered to a child with impaired sensory, manual, or speaking
skills, the test results accurately reflect the child's aptitude or achievement
level or whatever other factors the test purports to measure, rather than
reflecting the child's impaired sensory, manual, or speaking skills (except where
those skills are the factors which the test purports to measure);

(d) No single procedure is used as the sole criterion for determining an
appropriate educational program for a child; and

(e) The evaluation is made by a multidisciplinary team or group of persons,
including at least one teacher or other specialist with knowledge in the area of
suspected disability.

(f) The child is assessed in all areas related to the suspected disability,
including, where appropriate, health, vision, hearing, social and emotional
status, general intelligence, academic performance, communicative status, and
motor abilities.

(20 U.S.C. 1412(5)(C).)

Comment. Children who have a speech impairment as their primary handicap may not
need a complete battery of assessments (e.g., psychological, physical, or adaptive
behavior). However, a qualified speech-language pathologist would (1) evaluate each
speech impaired child using procedures that are appropriate for the diagnosis and
appraisal of speech and language disorders, and (2) where necessary, make referrals
for additional assessments needed to make an appropriate placement decision.

§ 121a.533 Placement procedures.

(a) In interpreting evaluation data and in making placement decisions, each
public agency shall:

(1) Draw upon information from a variety of sources, including aptitude and
achievement tests, teacher recommendations, physical condition, social or
cultural background, and adaptive behavior;

(2) Insure that information obtained from all of these sources is documented
and carefully considered;

(3) Insure that the placement decision is made by a group of persons, including
persons knowledgeable about the child, the meaning of the evaluation data, and
the placement options; and

(4) Insure that the placement decision is made in conformity with the least
restrictive environment rules in §§ 121a.550-121a.554.

(b) If a determination is made that a child is handicapped and needs special
education and related services, an individualized education program must be
developed for the child in accordance with §§ 121a.340-121a.349 of Subpart C.

(20 U.S.C. 1412(5)(C); 1414(a)(5).)

Comment. Paragraph (a)(1) includes a list of examples of sources that may be used
by a public agency in making placement decisions. The agency would not have to use
all the sources in every instance. The point of the requirement is to insure that
more than one source is used in interpreting evaluation data and in making placement
decisions. For example, while all of the named sources would have to be used for a
child whose suspected disability is mental retardation, they would not be necessary
for certain other handicapped children, such as a child who has a severe
articulation disorder as his primary handicap. For such a child, the speech-language
pathologist, in complying with the multisource requirement, might use (1) a
standardized test of articulation, and (2) observation of the child's articulation
behavior in conversational speech.

§ 121a.534 Reevaluation.

Each State and local educational agency shall insure:

(a) That each handicapped child's individualized education program is reviewed
in accordance with §§ 121a.340-121a.349 of Subpart C, and

(b) That an evaluation of the child, based on procedures which meet the
requirements under § 121a.532, is conducted every three years or more frequently
if conditions warrant or if the child's parent or teacher requests an evaluation.

(20 U.S.C. 1412(5)(C).)



From pp. 42496 & 42497 of Federal Register, August 23, 1977. Education of Handicapped
Children, Regulations Implementing Education for All Handicapped Children Act of 1975
(Public Law 94-142).

APPENDIX I

OUTLINE OF NONBIASED ASSESSMENT PROCEDURES
DEVELOPED BY THE NORTHEAST REGIONAL RESOURCE CENTER

REFERRAL 

1. Are the parents/guardians aware that a referral has been made for their child,
and by whom?

2. Is this child's presenting problem clearly and precisely stated on the referral?

a. Does the referral include descriptive examples of behavior rather than
opinions of the referring agent?

b. Is there supportive documentation of the problem?

3. Is the referral legitimate?

a. Does the referring agent have a history of over referral of children from
certain cultural groups?

b. Could irrelevant personal characteristics (e.g., sex or attractiveness)
of the child have influenced the decision to refer him?

c. Could the referring agent have misinterpreted this child's actions or
expression due to his lack of understanding of cultural differences between
himself and the child?

4. Can the assessment team provide the referring agent with interim recommendations
that may eliminate the need for a comprehensive evaluation?

a. Is it possible that the curriculum being used assumes that this child has
developed readiness skills at home that in reality he hasn't had the
opportunity to develop? If so, can the team assist the teacher in planning a
program to give this child the opportunity to develop readiness skills?

b. Can the team provide information on the child's cultural background for the
referring agent so that there are fewer misunderstandings between the
referring agent and this child and perhaps other children of similar cultural
background?

5. Have I informed this child's parents/guardians in their primary language of the
referral?

a. Have I explained the reason(s) for the referral?

b. Have I discussed with the parents what next step activities may be involved?
e.g., - professional evaluations

- use of collected data

- design of an individualized educational plan, if necessary

c. Have I discussed due process procedures with the parents?

d. Do I have documented parental permission for the evaluation?

e. Have I asked the parents to actively participate in all phases of the
assessment process?

f. Have I informed the parents of their right to examine all relevant records
in regard to the identification, evaluation and educational plan of their
child?

MEETING THE CHILD 

1. What special conditions about this child do I need to consider?

a. What is the child's primary home language?

b. Do I know about the child's home environmental factors? 
e.g., - familial relationships/placement

- social and cultural customs

c. Do I understand this child's culture and language so that I can evoke a
level of performance which accurately indicates the child's underlying
competencies?

d. Is this child impeded by a handicap other than the referral problem that
may result in his not understanding what I am talking about?

2. What special conditions about myself do I need to consider? 

a. How do I feel about this child?

b. Are my values different from this child's?

c. Will my attitude unfairly affect this child's performance?

d. Can I evaluate this child fairly and without prejudice?

e. If not, would I refer him to another assessor if one is available?

3. Have I examined closely all the available existing information and sought
additional information concerning this child?

a. Has the child's academic performance been consistent from year to year? 

b. Is there evidence in this child's record that his performance was negatively
or positively affected by his classroom placement or teacher?

c. Are his past test scores consistent with his past class performance?

d. Am I familiar with past test instruments used to evaluate this child and
how well can I rely on his prior test scores?

e. Have I observed this child in as many environments as possible (individual,
large group, small group, play, home)?

f. Am I making illegitimate assumptions about this child? e.g., do I assume
he speaks and reads Spanish simply because he is Puerto Rican?

g. Have I actively sought additional information on non-school related variables
that may have affected this child's school performance?
e.g., - health factors (adequate sleep, food)

- family difficulties

- peer group pressures

4. Does this child understand why he is in the assessment situation?

a. Have I tried to explain at his level of understanding what the reasons were
for his referral?

b. Have I given this child the opportunity to freely express his perceptions
of "the problem"?

c. Have I discussed with the child what next step activities may be involved?

SELECTION OF APPROACH FOR ASSESSMENT

1. Have I considered what the best assessment approach is for this child?

a. Considering the reasons for referral, do I need to utilize behavioral
observations, interviews, informal techniques or standardized techniques or a
combination of the above?

b. Have I given as much thought to assessing this child's adaptive behavior as
I have to his academic school performance?

c. Are the approaches I am considering consistent with the child's receptive 
and expressive abilities? 

d. Am I placing an overdependence on one technique and overlooking others that 
may be more appropriate? 

e. Have I achieved a balance between formal and informal techniques in my
selection?

2. If I have selected to use standardized instruments, have I considered all of the
ramifications?

a. Am I testing this child simply because I've always used tests in my
assessment procedure?

b. Am I administering a particular test simply because it is part of THE
BATTERY?

c. Am I administering a test because I have been directed to do so by the
Administration?

d. Does the instrument I've chosen include persons in the standardization
sample from this child's cultural group?

e. Are subgroup scores reported in the manual?

f. Were there large enough numbers of this child's cultural group in the test
sample for me to have any reliance on the norms?

g. Does the instrument I have selected assume a universal set of experiences
for all children?

h. Does the instrument selected contain illustrations that are misleading
and/or outdated?

i. Does the instrument selected employ vocabulary that is colloquial,
regional and/or archaic?

j. Do I understand the theoretical basis of the instrument?

k. Will this instrument readily assist in delineating a recommended course
of action to benefit this child?

l. Have I reviewed current literature regarding this instrument?

m. Have I reviewed research related to potential cultural influences on test
results?
results? 

TEST ADMINISTRATION

1. Are there factors (attitude, physical conditions) which support the need to
reschedule this child for evaluation at another time?

2. Could the physical environment of the test setting adversely affect this child's
performance?

- room temperature

- noise

- inadequate space

- poor lighting

- furnishings inappropriate for child's size

3. Am I familiar with the test manual and have I followed its directions?

4. Have I given this child clear directions?

a. If his native language is not English, have I instructed him in his language?

b. Am I sure that this child understands my directions?

5. Have I accurately recorded entire responses to test items, even though the child's
answers may be incorrect, so that I might later consider them when interpreting
his test scores?

6. Did I establish and maintain rapport with this child throughout the evaluation
session?

SCORING AND INTERPRETATION

1. Have I examined each item missed by this child rather than merely looking at his
total score?

a. Is there a pattern to the types of items this child missed?

b. Are the items missed free of cultural bias?

c. If I omitted all items missed that are culturally biased, would this child
have performed significantly better?

2. Am I aware that I must consider other factors in the interpretation of this
child's scores?

a. Have I considered the effect the child's attitude and/or physical condi- 
tion may have had on his performance? 

b. Have I considered the effect that the child's lack of rapport with me may
have had on his performance?

c. Does my interpretation of this child's performance include observations?

d. Do I realize that I should report and interpret scores within a range 
rather than as a number? 

3. What confidence do I have in this child's test scores?

a. Are test scores the most important aspect of this child's evaluation?

b. Will I allow test scores to outweigh my professional judgment about this
child?

CONSULTATION WITH TEAM MEMBERS AND OTHERS

1. Am I working as an integral member of a multidisciplinary team on behalf of this
child?

a. Have I met with the team to share my findings regarding this child?

b. Are other team members' evaluation results in conflict with mine?

c. Can I admit my discipline's limitations and seek assistance from other
team members?

d. Do I willingly share my competencies and knowledge with other team mem- 
bers for the benefit of this child? 

e. Has the team arrived at its conclusion as a result of team consensus or
was our decision influenced by the personality and/or power of an
individual team member?

2. Is the multidisciplinary team aware of its limitations?

a. Are we aware of community resource personnel and agencies that might assist
us in developing an educational plan for this child? Do we utilize such
resources before, during, and after the evaluation?

b. Do we on the team feel comfortable in including this child's parents in our
discussions?

ASSESSMENT REPORT

1. Is my report clearly written and free of jargon so that it can be easily
understood by this child, his parents, and teachers?

2. Does my report answer the questions asked in the referral?

3. Are the recommendations I have made realistic and practical for the child, school,
teacher and parents?

4. Have I provided alternative recommendations?

5. Have I included in my report a description of any problems that I encountered
and the effects of such during the assessment process? 

INDIVIDUAL EDUCATIONAL PLAN

1. Are we making this child fit into an established program or are we developing
an individualized educational plan appropriate for this child?

a. Have we identified this child's strengths and weaknesses?

b. Have we specified long range goals and immediate objectives for this child?

c. Are we willing to assist the teacher in implementing this child's
educational plan?

d. Have we stated when and how this child's progress will be evaluated and by
whom?

FOLLOW UP 

1. What are my responsibilities after we have written this child's educational plan?

a. Have I discussed my findings and recommendations with this child's parents
and explained their due process rights? Have I given the parents a written
copy of this child's educational plan?

b. Have I met with those working with this child to discuss the educational
plan and to assist them in implementing its recommendations?

c. Have I discussed my findings and recommendations with this child at his
level of understanding?

d. Can I help those working directly with the child to become more familiar
with this child's social and cultural background?

e. Have I sought this child's parents' permission for release of any
confidential materials to other agencies and professionals?

f. Will I periodically review this child's educational plan in regard to his
actual progress so that any necessary changes can be made?

SOME FINAL THOUGHTS

1. Do I believe in the right to an appropriate education for all children?

2. Would I be comfortable if MY child had been involved in THIS assessment process?

3. Is there a willingness and desire on my part to actively participate in in-service
activities that will lead to the further development of my personal and
professional growth?

I. Programming and Intervention in the Regular Classroom.

A. Basic Principle: Prior to referral to special education diagnostic services, solutions
to classroom learning and adjustment problems should be attempted in the regular
classroom.

B. Basic Principle: Various resource personnel, e.g., remedial reading specialists,
curriculum consultants, counselors, psychologists, speech clinicians and social
workers, should be available to assist teachers in developing educational procedures
for meeting the child's needs in the regular classroom.
Considerations:

1. Are specially trained personnel available to assist classroom teachers and do
these personnel provide assistance to teachers in developing alternative
procedures in the regular classroom?

2. What changes are made in the regular classroom programs in order to serve
children with diverse backgrounds and diverse characteristics?

3. What alternative materials and approaches, independent of special education,
exist and have been attempted for children with learning and adjustment
problems?

4. In cases referred to special education services, what evidence exists to confirm
that attempts were made to solve the problem within the regular classroom? 
Were special personnel involved? Was an organized plan developed? Was the 
plan implemented? Was the plan given sufficient time to be successful? 

5. Were efforts made to inform parents of the problem and attempted solutions,
and were parents given an opportunity to contribute to solutions attempted in 
the regular classroom? 

II. Screening and Referral Phase.

A. Basic Principle: Prior to formal diagnostic procedures, adequate information
should be obtained which establishes the nature and extent of deviation from
reasonable expectations.

Considerations:

1. Is the concern related to classroom learning or adjustment stated or restated 
specifically in behavioral terms rather than in terms of a special education 
category? 

2. Is the concern related to current classroom learning or adjustment supported
and illustrated by descriptive samples of behaviors? 

3. Is consideration given to and evidence provided concerning the child's
strengths within school and in other situations? 

4. Are other sources of information considered systematically? Is this
information consistent or inconsistent with the referral? Other sources of information
should include the educational history (evaluations by previous teachers,
previous educational methods and materials used, previous grades), achievement
test scores, previous evaluations by support personnel, previous and current
social and emotional patterns of behavior, etc.

5. Do the above sources of information confirm the need for consideration of
special education alternatives or does the information suggest that solutions
should be attempted within the regular classroom?



The Guidelines were developed by a committee appointed by 
George Garcia, Director, Urban Education Section of the Iowa 
Department of Public Instruction. Committee members included
Daniel Reschly, Consultant, George Garcia, Jeff Grimes, Wilbur
House, Marry Maitre, Pat O'Rourke, and Wayne Mooers. The
Guidelines have not been approved officially by any division of the
Iowa Department of Public Instruction.

B. Basic Principle: Parental involvement shall be obtained in all phases of referral,
evaluation, and placement. Informed consent and due process procedures should be 
initiated early and followed throughout. 
Considerations: 

1. Are parents informed of the reasons for the referral in precise, meaningful
language?

2. Have all communications been in the primary language of the home? 

3. Does the school use a variety of means to solicit active parental participation in 
all phases of evaluation and staffing? Are parents informed of their rights to
examine all relevant records? 

4. Are parents provided with information concerning the activities and kind of
decisions anticipated in evaluation and staffing along with estimates of time 
required, and specification of personnel responsible? 

III. Evaluation. 

A. Basic Principle: The evaluation of children referred for special education services 
should be conducted by a multidisciplinary team. 

Considerations: 

1. Is someone assigned the responsibility of coordinating the work of the team 
members including, a) evaluating the referral, b) determining the kind of 
information needed, c) assigning appropriately trained personnel to collect 
the data, d) facilitating communication among the team members? 

2. Are interim procedures established for assisting the child and classroom 
teacher while the evaluation and staffing are conducted? 

B. Basic Principle: Multifactored Assessment. Children should be assessed in all areas
related to the suspected handicap including, where appropriate, health, vision,
hearing, adaptive behavior, sociocultural background, emotional status, academic
performance, aptitude (intelligence), language, and psychomotor ability. No single
procedure such as IQ test results is used as the primary source of information, and
the assessment procedures are used to identify areas of specific educational need.
"Testing and evaluation materials and procedures used for the purposes of
evaluation and placement of handicapped children must be selected and administered so as
not to be racially or culturally discriminatory."

Considerations:

1. Situational Assessment. Is an assessment of the school or classroom
environment conducted which includes a behavioral definition of the referral
problem(s)? Are data collected on the frequency and magnitude of the problem(s),
and a study made of the antecedent, situational, and consequent conditions 
related to the problem? 

2. Health History. Are data collected on physical/health conditions which may be 
related to the learning problem? This information would include factors such 
as developmental history, disease and injury data, sensory data, sensory
status, medication(s) used, and nutrition.

3. Personal and Social Adjustment. Is personal and social adjustment (adaptive 
behaviors) in the home, neighborhood, and broader community evaluated 
using formal and informal data collection procedures? 

4. Personal and Social Adjustment. Is personal and social adjustment (adaptive 
behaviors) in the school setting evaluated with formal and informal data 
collection procedures? 

5. Primary Language. Is the child's primary language dominance determined,
and are the assessment procedures administered and interpreted in a manner 
consistent with the primary language data? 

6. Social and Cultural Background. Is the sociocultural background of the child
assessed systematically, and are the results of other assessment procedures
interpreted in light of the sociocultural data?

7. Educational Achievement: Norm-Referenced. Is educational achievement 
assessed with norm-referenced instruments which yield valid information
concerning the child's current performance in relation to grade level 
expectancies? 

8. Educational Achievement: Criterion-Referenced. Is educational achievement
assessed with criterion-referenced instruments or devices which provide valid
information concerning specific skills and deficit areas? 

9. Aptitude. Is academic aptitude, i.e., general intelligence, assessed with
appropriate instruments available, consideration given to variations in
performance over different factors of academic aptitude, and results interpreted
in view of strengths and limitations of such measures?

10. Psychoeducational Process. Are psychoeducational processes and motor skills
related to learning assessed, and the influence of these factors on the learning
or adjustment problem considered (e.g., attention, eye-hand coordination,
language, visual-motor, visual perception, auditory discrimination, etc.)?

11. Other information. Where appropriate, is information from other areas
potentially important to placement and educational programming considered, e.g.,
career and vocational interests and aptitudes?



IV. Staffing. 

A. Basic Principles: Placement decisions should be based upon information from a
variety of sources (see previous section). Consideration of the information from the
multifactored assessment should be documented in the staffing report. Placement 
decisions should be made by a group of persons including appropriate professional 
personnel and parents. The least restrictive alternative principle shall guide the 
selection of option for serving children. 
Considerations: 

1. What evidence exists which documents the consideration of a broad variety of 
information, including both strengths and deficits, in determining educational
needs and selection of placement options? 

2. Does the determination of educational needs and selection of placement option
include the contributions of relevant professional personnel and parents? 

3. Are current educational status and educational needs stated precisely and 
supported by data? 

4. Are alternative options considered for meeting these needs including regular 
education with or without support services?

5. Are special education eligibility recommendations made in conformance with 
the criteria for primary handicapping condition as defined in the Department 
of Public Instruction Special Education Rules and Regulations? 

6. In making the special education eligibility recommendations, did the multi- 
disciplinary team consider a broad variety of information including adaptive 
behavior and sociocultural background? How did this information influence 
the recommendations concerning goals for intervention and placement
option? 



7. Are a variety of program options considered in view of the information from 
the multifactored assessment? For example, using information on adaptive 
behavior outside of school to choose between special classes and resource 
options for mild mental retardation.

8. What evidence supports the choice of program option as an appropriate alter- 
native for meeting the child's needs? 

9. Is an interim plan developed and implemented to assist the child in the regular 
classroom until the placement recommendations are carried out? 

10. Do the special education personnel inform parents of the primary
handicapping condition (if any) and explain the full range of available alternatives for
meeting the child's needs? 

11. Do parents contribute to decisions concerning the objectives of special
education service selected?

12. Are there provisions for members of the multidisciplinary staffing team to
express opinions which disagree with the decision of the majority? Are the
dissenting opinions in written form expressing the reasons for disagreement?

ADAPTIVE BEHAVIOR INSTRUMENTS

1. American Association on Mental Deficiency Adaptive Behavior Scale for Children
and Adults. Order from AAMD, 5101 Wisconsin Avenue, N.W., Washington, D.C.,
20016.

2. Public School Version Adaptive Behavior Scale. Order from AAMD, 5101 Wisconsin
Avenue, N.W., Washington, D.C., 20016.

3. Camelot Behavioral Checklist. Order from Camelot Behavioral Systems, P.O. Box
3447, Lawrence, KS, 66044. Also Erdmark Associates, 1329 Northup Way,
Bellevue, WA, 98005.

4. Cain-Levine Social Competency Scale. Order from Consulting Psychologists Press, 
577 College Avenue, Palo Alto, CA, 94306. 

5. Balthazar Scales of Adaptive Behavior. Consulting Psychologists Press; see above.

6. Adaptive Behavior Inventory for Children (Part of SOMPA). Psychological
Corporation, 757 Third Avenue, New York, NY, 10017.

7. Vineland Social Maturity Scale. Order from American Guidance Services, Publish- 
ers Building, Circle Pines, MN, 55014. 

8. Children's Adaptive Behavior Scale. Order from Humanics, Ltd., P.O. Box 7447, 
Atlanta, GA, 30309.